We study the limiting spectral measure of large symmetric random
matrices of linear algebraic structure.

For Hankel and Toeplitz matrices generated by i.i.d. random variables
$\{X_k\}$ of unit variance, and for symmetric Markov matrices generated
by i.i.d. random variables $\{X_{ij}\}_{j>i}$ of zero mean and unit
variance, scaling the eigenvalues by $\sqrt{n}$ we prove the almost sure,
weak convergence of the spectral measures to universal, nonrandom,
symmetric distributions $\gamma_H$, $\gamma_M$ and $\gamma_T$ of unbounded
support. The moments of $\gamma_H$ and $\gamma_T$ are the sum of volumes of
solids related to Eulerian numbers, whereas $\gamma_M$ has a bounded smooth
density given by the free convolution of the semicircle and normal densities.

For symmetric Markov matrices generated by i.i.d. random variables
$\{X_{ij}\}_{j>i}$ of mean $m$ and finite variance, scaling the eigenvalues
by $n$ we prove the almost sure, weak convergence of the spectral measures
to the atomic measure at $-m$. If $m = 0$, and the fourth moment is finite,
we prove that the spectral norm of $M_n$ scaled by $\sqrt{2n \log n}$
converges almost surely to 1.



1. Introduction and main results. For a symmetric $n \times n$ matrix $A$,
let $\lambda_j(A)$, $1 \le j \le n$, denote the eigenvalues of the matrix $A$,
written in a nonincreasing order. The spectral measure of $A$, denoted
$\hat\mu(A)$, is the empirical distribution of its eigenvalues, namely

$$\hat\mu(A) = \frac{1}{n} \sum_{j=1}^{n} \delta_{\lambda_j(A)}$$

[so when $A$ is a random matrix, $\hat\mu(A)$ is a random measure on
$(\mathbb{R}, \mathcal{B})$].

Large-dimensional random matrices are of much interest in statistics, 
where they play a pivotal role in multivariate analysis. In his seminal paper, 
Wigner [24] proved that the spectral measure of a wide class of symmetric
random matrices of dimension $n$ converges, as $n \to \infty$, to the
semicircle law (also called the Sato-Tate measure, see [21] and the
references therein). Much work has since been done on related random matrix
ensembles, either composed of (nearly) independent entries, or drawn
according to weighted Haar measures on classical (e.g., orthogonal, unitary,
symplectic) groups.
The limiting behavior of the spectrum of such matrices and their composi- 
tions is of considerable interest for mathematical physics (see [17] and the 
references therein). In addition, such random matrices play an important 
role in operator algebra studies initiated by Voiculescu, known now as the 
free (noncommutative) probability theory (see [12] and the many references 
therein). The study of large random matrices is also related to interesting 
questions of combinatorics, geometry and algebra (see [9], or, e.g., [22]). In 
his recent review paper [1], Bai proposes the study of large random ma- 
trix ensembles with certain additional linear structure. In particular, the 
properties of the spectral measures of random Hankel, Markov and Toeplitz 
matrices with independent entries are listed among the unsolved random 
matrix problems posed in [1], Section 6. We shall provide here the solution 
for these three problems. 

We note in passing that Hankel matrices arise, for example, in polynomial
regression, as the covariance for the least squares parameter estimation for
the model $\sum_{i=0}^{p} b_i x^i$, observed in the presence of additive
noise (see [20], page 36). Toeplitz matrices appear as the covariance of
stationary processes, in shift-invariant linear filtering, and in many aspects of
combinatorics, time series and harmonic analysis. See [10] for classical re- 
sults on deterministic Toeplitz matrices, or [7] and the references therein, 
for their applications to certain random matrices. The infinitesimal genera- 
tors of continuous-time Markov processes on finite state spaces are given by 
matrices with row-sums zero (which we call Markov matrices). Such matri- 
ces also play an important role in graph theory, as the Laplacian matrix of 
each graph is of this form, with its eigenvalues related to numerous graph 
invariants; see [15]. 

We next specify the corresponding ensembles of random matrices studied
here. Let $\{X_k : k = 0, 1, 2, \ldots\}$ be a sequence of i.i.d. real-valued
random variables. For $n \in \mathbb{N}$, define a random $n \times n$ Hankel
matrix $H_n = [X_{i+j-1}]_{1 \le i,j \le n}$, and a random $n \times n$
Toeplitz matrix

$$T_n = [X_{|i-j|}]_{1 \le i,j \le n}. \tag{1.2}$$

The limiting spectral distribution for a Toeplitz matrix $T_n$ is as follows.

Theorem 1.1. Let $\{X_k : k = 0, 1, 2, \ldots\}$ be a sequence of i.i.d.
real-valued random variables with $\mathrm{Var}(X_1) = 1$. Then with
probability 1, $\hat\mu(T_n/\sqrt{n})$ converges weakly as $n \to \infty$ to a
nonrandom symmetric probability measure $\gamma_T$ which does not depend on
the distribution of $X_1$, and has unbounded support.

The spectrum of nonrandom Toeplitz matrices, the rows of which are
typically absolutely summable, is well approximated by its counterpart for
circulant matrices (cf. [10], page 84). In contrast, note that the limiting
distribution $\gamma_T$ is not normal as the calculation shows that the
fourth moment is $m_4 = 8/3$. This differs from the analogous results for
random circulant matrices (see [4]), a fact that has been independently
noticed also in references [3, 11].

Our next result gives the limiting spectral distribution for a Hankel matrix
$H_n$.

Theorem 1.2. Let $\{X_k : k = 0, 1, 2, \ldots\}$ be a sequence of i.i.d.
real-valued random variables with $\mathrm{Var}(X_1) = 1$. Then with
probability 1, $\hat\mu(H_n/\sqrt{n})$ converges weakly as $n \to \infty$ to a
nonrandom symmetric probability measure $\gamma_H$ which does not depend on
the distribution of $X_1$, has unbounded support and is not unimodal.

Fig. 1. Histograms of the empirical distribution of eigenvalues of 100
realizations of the Hankel and Toeplitz matrices with standardized triangular
$U - U'$ entries. Panels: $\hat\mu(H_{500}/\sqrt{500})$ and
$\hat\mu(T_{500}/\sqrt{500})$.

(Recall that a symmetric distribution $\nu$ is said to be unimodal, if the
function $x \mapsto \nu((-\infty, x])$ is convex for $x < 0$.)

Remark 1.1. Theorems 1.1 and 1.2 fall short of establishing that the
limiting distributions have smooth densities and that the density of
$\gamma_H$ is bimodal. Simulations suggest that these properties are likely
to be true; see Figure 1.



Remark 1.2. Consider the empirical distribution of singular values of
the nonsymmetric random $n \times n$ Toeplitz matrix
$R_n = [X_{i-j}]_{1 \le i,j \le n}$. It follows from Theorem 1.2 that as
$n \to \infty$, with probability 1,
$\hat\mu((R_n^T R_n)^{1/2}/\sqrt{n}) \to \nu$ weakly, where
$\nu([0,x]) = \gamma_H([-x,x])$, $x > 0$. Indeed, let
$J_n = [1_{i+j=n+1}]_{1 \le i,j \le n}$, noting that $J_n \times R_n$ is the
Hankel matrix $H_n$ for $\{X_{k-n} : k = 0, 1, \ldots\}$ to which Theorem 1.2
applies. Since $J_n^2 = I_n$, and both $J_n$ and $J_n \times R_n$ are
symmetric, we have $R_n^T R_n = (J_n R_n)^T J_n R_n = H_n^2$. Thus the
singular values of matrix $R_n$ are the absolute values of the (real)
eigenvalues of the symmetric Hankel matrix $H_n$.

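We now turn to the Markov matrices $M_n$. Let $\{X_{ij} : j > i \ge 1\}$ be
an infinite upper triangular array of i.i.d. random variables and define
$X_{ji} = X_{ij}$ for $j > i \ge 1$. Let $M_n$ be a random $n \times n$
symmetric matrix given by

$$M_n = X_n - D_n, \tag{1.3}$$

where $X_n = [X_{ij}]_{1 \le i,j \le n}$ and
$D_n = \mathrm{diag}(\sum_{j=1}^{n} X_{ij})_{1 \le i \le n}$ is a diagonal
matrix, so each of the rows of $M_n$ has a zero sum (note that the values of
$X_{ii}$ are irrelevant for $M_n$), that is,

$$M_n = \begin{bmatrix}
-\sum_{j=2}^{n} X_{1j} & X_{12} & X_{13} & \cdots & X_{1n} \\
X_{21} & -\sum_{j \ne 2} X_{2j} & X_{23} & \cdots & X_{2n} \\
\vdots & \vdots & \ddots & & \vdots \\
X_{k1} & X_{k2} & \cdots & -\sum_{j \ne k} X_{kj} \ \cdots & X_{kn} \\
\vdots & \vdots & & \ddots & \vdots \\
X_{n1} & X_{n2} & \cdots & \cdots & -\sum_{j=1}^{n-1} X_{nj}
\end{bmatrix}.$$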
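For illustration only, the three ensembles can be built from plain lists as in the following Python sketch. It is not part of the original argument: the function names, the Gaussian entries in the demo and the 0-based list indexing are assumptions made for the example, with the Hankel entry taken as $X_{i+j-1}$ and the Toeplitz entry as $X_{|i-j|}$.

```python
import random


def hankel(x, n):
    """H_n = [X_{i+j-1}] for 1 <= i, j <= n (uses x[1], ..., x[2n-1])."""
    # With 0-based row/column indices i, j the entry index is i + j + 1.
    return [[x[i + j + 1] for j in range(n)] for i in range(n)]


def toeplitz(x, n):
    """T_n = [X_{|i-j|}] for 1 <= i, j <= n (uses x[0], ..., x[n-1])."""
    return [[x[abs(i - j)] for j in range(n)] for i in range(n)]


def markov(upper, n):
    """M_n = X_n - D_n, built from upper[(i, j)] with 0 <= i < j < n."""
    m = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            m[i][j] = m[j][i] = upper[(i, j)]      # symmetric off-diagonal part
    for i in range(n):
        # Diagonal entry is minus the sum of the other entries of the row,
        # so every row of M_n sums to zero.
        m[i][i] = -sum(m[i][j] for j in range(n) if j != i)
    return m


if __name__ == "__main__":
    rng = random.Random(0)
    n = 4
    x = [rng.gauss(0.0, 1.0) for _ in range(2 * n)]
    upper = {(i, j): rng.gauss(0.0, 1.0) for i in range(n) for j in range(i + 1, n)}
    for name, mat in (("H", hankel(x, n)), ("T", toeplitz(x, n)), ("M", markov(upper, n))):
        print(name, [[round(v, 2) for v in row] for row in mat])
```

The diagonal of $X_n$ never enters `markov`, in line with the remark that the values of $X_{ii}$ are irrelevant for $M_n$.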

Wigner's classical result says that $\hat\mu(X_n/\sqrt{n})$ converges weakly
as $n \to \infty$ to the (standard) semicircle law with the density
$\sqrt{4 - x^2}/(2\pi)$ on $(-2,2)$. For normal $X_n$ and normal i.i.d.
diagonal $D_n$ independent of $X_n$, the weak limit of
$\hat\mu((X_n - D_n)/\sqrt{n})$ is the free convolution of the semicircle and
standard normal measures; see [17] and the references therein (see also [2]
for the definition and properties of the free convolution). This predicted
result holds also for the Markov matrix $M_n$, but the problem is nontrivial
because $D_n$ strongly depends on $X_n$.

Theorem 1.3. Let $\{X_{ij} : j > i \ge 1\}$ be a collection of i.i.d. random
variables with $\mathbb{E}X_{12} = 0$ and $\mathrm{Var}(X_{12}) = 1$. With
probability 1, $\hat\mu(M_n/\sqrt{n})$ converges weakly as $n \to \infty$ to
the free convolution $\gamma_M$ of the semicircle and standard normal
measures. This measure $\gamma_M$ is a nonrandom symmetric probability
measure with smooth bounded density, does not depend on the distribution of
$X_{12}$ and has unbounded support.

If the mean of $X_{ij}$ is not zero, the following result is relevant.

Theorem 1.4. Let $\{X_{ij} : i, j \in \mathbb{N}, j > i \ge 1\}$ be a
collection of i.i.d. random variables with $\mathbb{E}X_{12} = m$ and
$\mathbb{E}X_{12}^2 < \infty$. Then $\hat\mu(M_n/n)$ converges weakly to
$\delta_{-m}$ as $n \to \infty$.

Turning to the asymptotic of the spectral norm
$|||M_n||| := \max\{\lambda_1(M_n), -\lambda_n(M_n)\}$ of the symmetric
matrix $M_n$, that is, the largest absolute value of its eigenvalues, we have
the following.

Theorem 1.5. Let $\{X_{ij} : i, j \in \mathbb{N}, j > i \ge 1\}$ be a
collection of i.i.d. random variables with $\mathbb{E}X_{12} = 0$,
$\mathrm{Var}(X_{12}) = 1$ and $\mathbb{E}X_{12}^4 < \infty$. Then

$$\lim_{n \to \infty} \frac{|||M_n|||}{\sqrt{2n \log n}} = 1 \quad \text{a.s.}$$

If the mean of $X_{ij}$ is not zero, the following result is relevant.

Corollary 1.6. Suppose $\mathbb{E}X_{12} = m$ and
$\mathbb{E}X_{12}^4 < \infty$. Then

$$\lim_{n \to \infty} \frac{|||M_n|||}{n} = |m| \quad \text{a.s.}$$

Theorem 1.5 reveals a scaling in $n$ that differs from that of the spectral
norm of Wigner's ensemble, where under the same conditions, almost surely,

$$\lim_{n \to \infty} \frac{|||X_n|||}{\sqrt{n}} = 2 \tag{1.4}$$

(cf. [1], Theorem 2.12). As shown in Section 2 en route to proving Theorems
1.4, 1.5 and Corollary 1.6, this is due to the domination of the diagonal
terms of $M_n$ in determining its spectral norm.

Remark 1.3. The asymptotic of the spectral norm of random Toeplitz 
T n and Hankel H n matrices is not addressed in this work. 

Theorems 1.4, 1.5 and Corollary 1.6 are proved in Section 2. The proofs
of Theorems 1.1 and 1.2, which are similar to each other, ultimately rely on
the method of moments and the well-known relation

$$\int x^k \, \hat\mu(A)(dx) = \frac{1}{n} \mathrm{tr}\, A^k$$

for an $n \times n$ symmetric matrix $A$. We begin in Section 3 by introducing the
combinatorial structures which describe the moments of the limiting distri- 
butions. (Proofs of the properties of the limiting distributions are postponed 
to the Appendix.) Then in Section 4.1 we use truncation arguments to re- 
duce the theorems to the case when the expected values of the moments of 
the spectral measures are finite. In Section 4.2 we show that under suitable 
integrability assumptions the expected values of moments of the spectral 
measures converge to the corresponding expressions from Section 3 as the 
size of the matrix $n \to \infty$. Representing the moments as traces, we use
independence of the entries and combinatorial arguments to discard the ir- 
relevant terms in the expansions (4.7) and (4.12). In Section 4.4 we show 
that the moments of the spectral measures are concentrated around their 
means, which allows us to conclude the proofs in Section 4.5. 
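The trace relation behind the method of moments can be tried out numerically. The following Python sketch is an illustration only, under stated assumptions: the helper names are hypothetical, the entries are random signs (so that $\mathrm{Var}(X_k) = 1$), and the matrix powers are computed naively, which is adequate only for small $n$.

```python
import random


def mat_mul(a, b):
    """Plain O(n^3) product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][l] * b[l][j] for l in range(n)) for j in range(n)]
            for i in range(n)]


def trace_moment(a, k):
    """k-th moment of the spectral measure of a, computed as (1/n) tr(a^k)."""
    n = len(a)
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(k):
        power = mat_mul(power, a)
    return sum(power[i][i] for i in range(n)) / n


def scaled_toeplitz(x):
    """T_n / sqrt(n) with T_n = [x_{|i-j|}], where n = len(x)."""
    n = len(x)
    scale = n ** 0.5
    return [[x[abs(i - j)] / scale for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    rng = random.Random(1)
    signs = [rng.choice((-1.0, 1.0)) for _ in range(60)]
    a = scaled_toeplitz(signs)
    # For +-1 entries the second moment is 1 up to rounding; the fourth
    # moment is random and is only expected to be near 8/3 for large n.
    print("m2 =", trace_moment(a, 2))
    print("m4 =", trace_moment(a, 4))
```

No eigenvalue solver is needed here, which is the point of the relation: moments of the spectral measure are read off from traces of powers.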

The proof of Theorem 1.3 follows a similar plan, with truncation argument 
in Section 4.1, followed by combinatorial analysis of expansion (4.17) for the 
traces and concentration of moments in Section 4.6. 
2. Proofs of Theorems 1.4, 1.5 and Corollary 1.6. We need the following
result, which follows by Chebyshev's inequality from [18], Section 6,
Theorem 5 or [19], Section 5, Corollary 5.

Lemma 2.1 (Sakhanenko). Let $\{\xi_i; i = 1, 2, \ldots\}$ be a sequence of
independent random variables with mean zero and
$\mathbb{E}\xi_i^2 = \sigma_i^2$. If $\mathbb{E}|\xi_i|^p < \infty$ for some
$p > 2$, then there exists a constant $C > 0$ and
$\{\eta_i, i = 1, 2, \ldots\}$, a sequence of independent normally distributed
random variables with $\eta_i \sim N(0, \sigma_i^2)$ such that

$$P\Bigl(\max_{1 \le k \le n} |S_k - T_k| > x\Bigr) \le \frac{C}{1 + |x|^p} \sum_{i=1}^{n} \mathbb{E}|\xi_i|^p$$

for any $n$ and $x > 0$, where $S_k = \sum_{i=1}^{k} \xi_i$ and
$T_k = \sum_{i=1}^{k} \eta_i$.



Proof of Theorem 1.5. Hereafter let $b(n) = \sqrt{2n \log n}$ denote the
normalization function for Theorem 1.5.

It follows from (1.3) that
$\bigl|\, |||M_n||| - |||D_n||| \,\bigr| \le |||X_n|||$. So, by (1.4) and the
definition of $D_n$, it suffices to show that as $n \to \infty$,

Note that $\{X_{ij}; j \ge 1\}$ is a sequence of i.i.d. random variables for
each $i \ge 1$. By Lemma 2.1 and the condition that
$\mathbb{E}|X_{12}|^4 < \infty$, for each $i \ge 1$, there exists a sequence
of independent standard normals $\{Y_{ij}; j \ge 1\}$ such that

for all $x > 0$ and $n \ge 1$, where $C$ is a constant which does not depend
on $n$ and $x$ (note that two sequences $\{Y_{ij}; j \ge 1\}$ for different
values of $i$ are not independent of each other). We claim that
By (2.3), for any $\varepsilon > 0$,



max U k > e < 2" 
for some constant $C_\varepsilon$ depending only on $\varepsilon$. Since
$\varepsilon > 0$ is arbitrary, by the Borel-Cantelli lemma,
$\max_{2^m \le k \le 2^{m+1}} U_k \to 0$ a.s. as $m \to \infty$, which
implies (2.4). Let

By the definitions in (2.1) and (2.4), we have that $W_n \le U_n + V_n$, so
by (2.4) we get (2.2) as soon as we show that
$\limsup_{n \to \infty} V_n \le 1$. To this end, fix $\delta > 0$ and
$\alpha > 1/\delta$. Then,
where Levy's inequality is used in the second step. Since the $Y_{ij}$'s are
independent standard normals,
$\xi := (m+1)^{-\alpha/2} \sum_{j=1}^{(m+1)^\alpha} Y_{1j}$ is a standard
normal random variable. Thus, by the well-known normal tail estimate

$$\frac{1}{\sqrt{2\pi}} \frac{x}{1 + x^2} e^{-x^2/2} \le P(\xi > x) \le \frac{1}{\sqrt{2\pi}} \frac{1}{x} e^{-x^2/2}, \qquad x > 0, \tag{2.6}$$

we see that

$$P\bigl(|\xi| > (1+\delta)(m+1)^{-\alpha/2} b(m^\alpha)\bigr) \le C_\delta m^{-\alpha(1+\delta)}$$

for some constant $C_\delta > 0$. Consequently, for some $C'_\delta > 0$ and
all $m$, by (2.5),

$$P\Bigl(\max_{m^\alpha \le n \le (m+1)^\alpha} V_n > 1 + \delta\Bigr) \le C'_\delta m^{-\alpha\delta}.$$

With $\alpha\delta > 1$, we have by the Borel-Cantelli lemma that

$$\limsup_{m \to \infty} \max_{m^\alpha \le n \le (m+1)^\alpha} V_n \le 1 + \delta \quad \text{a.s.}$$

It follows that $\limsup_{n \to \infty} V_n \le 1 + \delta$ a.s. and taking
$\delta \downarrow 0$ we obtain (2.2).
We next prove that

(2.7)
By (2.2), $\limsup_{n \to \infty} W_{n^\varepsilon} \le 1$ a.s. Thus, with
$b(n^\varepsilon)/b(n) \to 0$ as $n \to \infty$, we have that

With $b(n) \ge \sqrt{n}$, by Lemma 2.1 there exists a sequence of independent
standard normals $\{Y_j\}$ such that for some $C = C(\delta) < \infty$ and
all $n$

Further, by the left inequality of (2.6) we have that for all $n$
sufficiently large,

Combining this bound with (2.11) and (2.10) we get that for all $n$ large
enough

P(K,i < 1 - 35) < (1 - 2n~( x -V + Cn~ l ) n ^ 
Recall that $\varepsilon > \delta$, implying that
$\sum_{n \ge 1} P(V_{n,1} \le 1 - 3\delta) < \infty$. By the Borel-Cantelli
lemma,

$$\liminf_{n \to \infty} V_{n,1} \ge 1 - 3\delta \quad \text{a.s.}$$

This together with (2.8) and (2.9) implies that almost surely
$\liminf_{n \to \infty} W_n \ge 1 - 3\delta$, and the lower bound (2.7)
follows by taking $\delta \downarrow 0$. □

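Proof of Corollary 1.6. Let $\tilde{M}_n$ denote the Markov matrix obtained
when $\tilde{X}_{ij} = X_{ij} - \mathbb{E}X_{ij}$ replaces $X_{ij}$ in (1.3).
Obviously,

$$M_n = \tilde{M}_n + Y_n, \tag{2.12}$$

where $Y_n = [Y_{ij}]$ is the $n \times n$ matrix with
$Y_{ij} = m - nm 1_{i=j}$. Clearly, $\lambda_1(Y_n) = 0$,
$\lambda_2(Y_n) = \cdots = \lambda_n(Y_n) = -nm$, so $|||Y_n||| = n|m|$. By
(2.12) and Theorem 1.5, we have that

$$\Bigl| \frac{|||M_n|||}{n} - \frac{|||Y_n|||}{n} \Bigr| \le \frac{|||\tilde{M}_n|||}{n} \to 0 \quad \text{a.s.}$$

as $n \to \infty$. This implies that $|||M_n|||/n \to |m|$ a.s. □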
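The spectrum of $Y_n$ used in this proof can be confirmed directly from the entries $Y_{ij} = m - nm 1_{i=j}$. The short Python check below is illustrative only (the helper names are hypothetical): the all-ones vector is sent to zero, and any vector with zero coordinate sum is an eigenvector for $-nm$.

```python
def shift_matrix(n, m):
    """Y_n with entries Y_ij = m - n*m*1{i=j}, as in (2.12)."""
    return [[m - (n * m if i == j else 0) for j in range(n)] for i in range(n)]


def apply(mat, v):
    """Matrix-vector product for a matrix given as a list of rows."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in mat]


if __name__ == "__main__":
    n, m = 6, 2
    y = shift_matrix(n, m)
    ones = [1] * n
    v = [1, -1] + [0] * (n - 2)          # coordinates sum to zero
    print(apply(y, ones))                # the zero vector: eigenvalue 0
    print(apply(y, v))                   # equals -n*m times v
```

Since the vectors with zero coordinate sum form an $(n-1)$-dimensional space, this accounts for all $n$ eigenvalues.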

In the context of this paper, the next lemma is very handy for truncation 
purposes. 

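Lemma 2.2. Let $\{X_{ij} : j > i \ge 1\}$ be an infinite triangular array of
i.i.d. random variables with $\mathbb{E}X_{12} = 0$ and
$\mathrm{Var}(X_{12}) = \sigma^2$. Let $X_{ji} = X_{ij}$ for $i < j$ and set
$X_{ii} = 0$ for all $i \ge 1$. Then,

$$\frac{1}{n^2} \sum_{i=1}^{n} \Bigl( \sum_{j=1}^{n} X_{ij} \Bigr)^2 \to \sigma^2 \quad \text{a.s.}$$

as $n \to \infty$.

Proof. Define

$$U_n := \sum_{i=1}^{n} \sum_{1 \le j < k \le n} X_{ij} X_{ik}. \tag{2.13}$$

Then

$$\frac{1}{n^2} \sum_{i=1}^{n} \Bigl( \sum_{j=1}^{n} X_{ij} \Bigr)^2 = \frac{1}{n^2} \sum_{i=1}^{n} \sum_{j=1}^{n} X_{ij}^2 + \frac{2}{n^2} U_n.$$

By the strong law of large numbers, the first term on the right-hand side
converges almost surely to $\sigma^2$, so it suffices to show that

$$\frac{U_n}{n^2} \to 0 \quad \text{a.s.} \tag{2.14}$$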
n z 
To this end, denote by $\mathcal{F}_k$ the $\sigma$-algebra generated by the
random variables $\{X_{ij}, 1 \le i, j \le k\}$. Noting that

$$U_{n+1} - U_n = \sum_{1 \le j < k \le n} X_{(n+1)j} X_{(n+1)k} + \sum_{i=1}^{n} \sum_{j=1}^{n} X_{ij} X_{i(n+1)},$$

it is easy to verify that $\{U_n : n \ge 1\}$ is a martingale for the
filtration $\{\mathcal{F}_n : n \ge 1\}$. Further, the $n^2(n-1)/2$ terms in
the sum (2.13) are uncorrelated. Indeed, if $i \ne i'$ and $j < k$,
$j' < k'$, then $\mathbb{E}(X_{ij} X_{ik} X_{i'j'} X_{i'k'}) = 0$ as at least
one of the four variables in this product must be independent of the others.
Thus, $\mathbb{E}(U_n^2) \le \sigma^4 n^2 (n-1)/2$ for any $n \ge 2$, and by
Doob's submartingale inequality

$$P\Bigl( \max_{1 \le i \le m^2} |U_i| \ge m^4 \varepsilon \Bigr) \le \frac{\mathbb{E}(U_{m^2}^2)}{m^8 \varepsilon^2} \le \frac{\sigma^4}{m^2 \varepsilon^2}.$$

It follows by the Borel-Cantelli lemma, that almost surely

$$Z_m := m^{-4} \max_{1 \le i \le m^2} |U_i| \to 0,$$

as $m \to \infty$. Since $n^{-2} |U_n| \le (m/(m-1))^4 Z_m$ whenever
$(m-1)^2 \le n \le m^2$, $m \ge 2$, we thus get (2.14). □

Let $d_{BL}$ denote the bounded Lipschitz metric

$$d_{BL}(\mu, \nu) = \sup\Bigl\{ \int f \, d\mu - \int f \, d\nu : \|f\|_\infty + \|f\|_L \le 1 \Bigr\}, \tag{2.15}$$

where $\|f\|_\infty = \sup_x |f(x)|$,
$\|f\|_L = \sup_{x \ne y} |f(x) - f(y)|/|x - y|$. It is well known (see [8],
Section 11.3) that $d_{BL}$ is a metric for the weak convergence of measures.
For the spectral measures of $n \times n$ symmetric real matrices $A$, $B$ we
have

$$d_{BL}(\hat\mu(A), \hat\mu(B)) \le \sup\Bigl\{ \frac{1}{n} \sum_{j=1}^{n} |f(\lambda_j(A)) - f(\lambda_j(B))| : \|f\|_L \le 1 \Bigr\} \le \frac{1}{n} \sum_{j=1}^{n} |\lambda_j(A) - \lambda_j(B)|.$$

By Lidskii's theorem ([13], see also [1], Lemma 2.3)

$$\sum_{j=1}^{n} |\lambda_j(A) - \lambda_j(B)|^2 \le \mathrm{tr}((B - A)^2),$$

so

$$d_{BL}^2(\hat\mu(A), \hat\mu(B)) \le \frac{1}{n} \mathrm{tr}((B - A)^2). \tag{2.16}$$
Proof of Theorem 1.4. We use the notation from the proof of Corollary 1.6
and write $\sigma^2 = \mathrm{Var}(X_{12})$. By (2.12) and (2.16) the bounded
Lipschitz metric (2.15) satisfies

as $n \to \infty$. Recall that all but one of the eigenvalues of $Y_n$ are
$-nm$, hence $\hat\mu(Y_n/n)$ converges weakly to $\delta_{-m}$. Combining
this with (2.17) and (2.18), we have that almost surely, $\hat\mu(M_n/n)$
converges weakly to $\delta_{-m}$. □

3. The limiting distributions $\gamma_H$, $\gamma_M$ and $\gamma_T$.

3.1. Moments. For a probability measure $\gamma$ on
$(\mathbb{R}, \mathcal{B})$, denote its moments by

$$m_k(\gamma) = \int x^k \, \gamma(dx).$$
The probability measures $\gamma_H$, $\gamma_M$ and $\gamma_T$ will be
determined from their moments. It turns out that the odd moments are zero,
and the even moments are the sums of numbers labeled by the pair partitions
of $\{1, \ldots, 2k\}$.

It is convenient to index the pair partitions by the partition words $w$;
these are words of length $|w| = 2k$ with $k$ pairs of letters such that the
first occurrences of each of the $k$ letters are in alphabetic order. In the
case $k = 2$ we have $1 \times 3$ such partition words

    aabb    abba    abab

which correspond to the pair partitions

    {1,2} ∪ {3,4}    {1,4} ∪ {2,3}    {1,3} ∪ {2,4}

of $\{1,2,3,4\}$. Recall that the number of pair partitions of
$\{1, \ldots, 2k\}$ is $1 \times 3 \times \cdots \times (2k-1)$.

Definition 3.1. For a partition word $w$, we define its height $h(w)$ as the
number of encapsulated partition subwords, that is, substrings of the form
$x w_1 x$, where $x$ is a single letter, and $w_1$ is either a partition word
or the empty word.

For example, $h(abcabc) = 0$,
$h(\underline{a}bcbc\underline{a}) = h(ab\underline{cc}ab) = 1$, while
$h(\underline{aa}\,\underline{bb}\,\underline{cc}) = h(\underline{abccba}) = 3$
(the encapsulating pairs of letters are underlined).

In the terminology of Bozejko and Speicher [5], $h$ assigns to a pair
partition the number of connected blocks which are of cardinality 2. These
connected blocks of cardinality 2 are the pairs of letters underlined in the
previous examples.

In Proposition A.5 we show that the even moments of the free convolution
$\gamma_M$ of the semicircle and standard normal measures are given by

$$m_{2k}(\gamma_M) = \sum_{w : |w| = 2k} 2^{h(w)}. \tag{3.1}$$
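Formula (3.1) is easy to evaluate by brute force for small $k$. The Python sketch below is an illustration under stated assumptions: the function names are hypothetical, words are generated by always pairing the first unpaired position, and the height is computed by testing, for each letter, whether every letter strictly between its two occurrences appears there twice.

```python
def partition_words(k):
    """Yield all partition words of length 2k (first occurrences alphabetical)."""
    letters = "abcdefghijklmnopqrstuvwxyz"      # enough for k <= 26

    def fill(word, next_letter):
        if None not in word:
            yield "".join(word)
            return
        i = word.index(None)                    # first unpaired position
        for j in range(i + 1, len(word)):
            if word[j] is None:
                w = list(word)
                w[i] = w[j] = letters[next_letter]
                yield from fill(w, next_letter + 1)

    yield from fill([None] * (2 * k), 0)


def height(w):
    """Number of substrings x w1 x with w1 empty or itself a partition word."""
    h = 0
    for x in set(w):
        inner = w[w.index(x) + 1:w.rindex(x)]
        # w1 is a partition word exactly when each of its letters has both
        # of its occurrences inside w1.
        if all(inner.count(c) == 2 for c in set(inner)):
            h += 1
    return h


def moment_gamma_m(k):
    """Right-hand side of (3.1): sum of 2^h(w) over words of length 2k."""
    return sum(2 ** height(w) for w in partition_words(k))


if __name__ == "__main__":
    for k in range(1, 5):
        print(k, moment_gamma_m(k))
```

For $k = 2$ the three words contribute $4 + 4 + 1$, so the fourth moment of $\gamma_M$ obtained this way is 9.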

For the Toeplitz and Hankel cases, with each partition word w we associate 
a system of linear equations which determine the cross section of the unit 
hypercube, and define the corresponding volume p(w). We have to consider 
these two cases separately. 

3.2. Toeplitz volumes. Let $w[j]$ denote the letter in position $j$ of the
word $w$. For example, if $w = abab$, then $w[1] = a$, $w[2] = b$,
$w[3] = a$, $w[4] = b$.

To every partition word $w$ we associate the following system of equations
in unknowns $x_0, x_1, \ldots, x_{2k}$:

$$\begin{aligned}
x_1 - x_0 + x_{m_1} - x_{m_1 - 1} &= 0, && \text{if } m_1 > 1 \text{ is such that } w[1] = w[m_1], \\
x_2 - x_1 + x_{m_2} - x_{m_2 - 1} &= 0, && \text{if there is } m_2 > 2 \text{ such that } w[2] = w[m_2], \\
&\ \ \vdots \\
x_i - x_{i-1} + x_{m_i} - x_{m_i - 1} &= 0, && \text{if there is } m_i > i \text{ such that } w[i] = w[m_i], \\
&\ \ \vdots \\
x_{2k-1} - x_{2k-2} + x_{2k} - x_{2k-1} &= 0, && \text{if } w[2k-1] = w[2k].
\end{aligned} \tag{3.2}$$

Although we list $2k - 1$ equations, in fact $k - 1$ of them are empty.
Informally, the left-hand sides of the equations are formed by adding the
differences over the same letter when the variables are written in the space
"between the letters." For example, writing the variables between the letters
of the word $w = ababcc$ we get

$$x_0 \; a \; x_1 \; b \; x_2 \; a \; x_3 \; b \; x_4 \; c \; x_5 \; c \; x_6. \tag{3.3}$$

The corresponding system of equations is

$$\begin{aligned}
x_1 - x_0 + x_3 - x_2 &= 0, \\
x_2 - x_1 + x_4 - x_3 &= 0, \\
x_5 - x_4 + x_6 - x_5 &= 0.
\end{aligned} \tag{3.4}$$



Since in every partition word $w$ of length $2k$ there are exactly $k$
distinct letters, this is the system of $k$ equations in $2k + 1$ unknowns.
We solve it for the variables that follow the last occurrence of a letter,
leaving us with $k + 1$ undetermined variables: $x_0$, and the $k$ variables
that follow the first occurrence of each letter.

We then require that the dependent variables lie in the interval
$I = [0, 1]$. This determines a cross section of the cube $I^{k+1}$ in the
remaining undetermined $k + 1$ coordinates, the volume of which we denote by
$p_T(w)$. For example, if $w = abab$, solving the first pair of equations
(3.4) for $x_3 = x_0 - x_1 + x_2$, $x_4 = x_0$, defines the solid

$$\{x_0 - x_1 + x_2 \in I\} \cap \{x_0 \in I\} \subset I^3$$

which has the (Eulerian) volume $p_T(abab) = 4/3! = 2/3$.
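The recipe above (free variables $x_0$ and those following first occurrences, dependent variables solved from left to right) can be approximated numerically. The Python sketch below is not the method of the paper: it is a midpoint-grid estimate of $p_T(w)$, with hypothetical function names, that stores each variable as an integer grid index $n_x$ standing for the value $(n_x + 1/2)/N$. Because every dependent variable is a combination of free variables with coefficients summing to 1, it lies in $[0,1]$ exactly when its index lies in $\{0, \ldots, N-1\}$.

```python
from itertools import product


def toeplitz_volume_grid(w, grid):
    """Midpoint-grid estimate of the Toeplitz volume p_T(w)."""
    length = len(w)
    first = {}       # letter -> position (1-based) of its first occurrence
    match = {}       # position of a second occurrence -> position of the first
    for pos, letter in enumerate(w, start=1):
        if letter in first:
            match[pos] = first[letter]
        else:
            first[letter] = pos
    # Free variables: x_0 and the variable following each first occurrence.
    free_positions = [0] + sorted(first.values())
    inside = 0
    for idx in product(range(grid), repeat=len(free_positions)):
        x = [None] * (length + 1)
        for p, v in zip(free_positions, idx):
            x[p] = v
        ok = True
        for p in range(1, length + 1):
            if p in match:
                i = match[p]
                # Solve x_i - x_{i-1} + x_p - x_{p-1} = 0 for x_p.
                x[p] = x[p - 1] - x[i] + x[i - 1]
                if not 0 <= x[p] < grid:
                    ok = False
                    break
        inside += ok
    return inside / grid ** len(free_positions)


if __name__ == "__main__":
    words = ("aabb", "abba", "abab")
    volumes = {w: toeplitz_volume_grid(w, 20) for w in words}
    print(volumes)
    print("estimated fourth moment:", sum(volumes.values()))
```

Summing the three estimates for $k = 2$ gives a value close to $1 + 1 + 2/3 = 8/3$, the fourth moment quoted after Theorem 1.1. The cost grows like $N^{k+1}$, so this is only practical for short words.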

We define measure $\gamma_T$ as a symmetric measure with even moments

$$m_{2k}(\gamma_T) = \sum_{w : |w| = 2k} p_T(w). \tag{3.5}$$

From Proposition 4.5 below it follows that (3.5) indeed defines a positive
definite sequence of numbers so that these are indeed the even moments of a
probability measure. Since $m_{2k}$ is at most the number $(2k - 1)!!$ of
words of length $2k$, these moments determine the limiting distribution
$\gamma_T$ uniquely.

3.3. Hankel volumes. We proceed similarly to the Toeplitz case. With
each partition word $w$ we associate the following system of equations in
unknowns $x_0, x_1, \ldots, x_{2k}$:

$$\begin{aligned}
x_1 + x_0 &= x_{m_1} + x_{m_1 - 1}, && \text{if } m_1 > 1 \text{ is such that } w[1] = w[m_1], \\
x_2 + x_1 &= x_{m_2} + x_{m_2 - 1}, && \text{if there is } m_2 > 2 \text{ such that } w[2] = w[m_2], \\
&\ \ \vdots \\
x_i + x_{i-1} &= x_{m_i} + x_{m_i - 1}, && \text{if there is } m_i > i \text{ such that } w[i] = w[m_i], \\
&\ \ \vdots \\
x_{2k-1} + x_{2k-2} &= x_{2k} + x_{2k-1}, && \text{if } w[2k-1] = w[2k].
\end{aligned} \tag{3.6}$$

Informally, the equations are formed by equating the sums of the variables
at the same letter. For example, the word $abab$ with the variables written
as in (3.3) gives rise to the system of equations

$$\begin{aligned}
x_1 + x_0 &= x_3 + x_2, \\
x_2 + x_1 &= x_4 + x_3.
\end{aligned} \tag{3.7}$$

As in the Toeplitz case, since there are exactly $k$ distinct letters in the
word, this is the system of $k$ equations in $2k + 1$ unknowns. We solve it
for the variables that precede the first occurrence of a letter, leaving us
with $k$ undetermined variables
$x_{\alpha_1}, \ldots, x_{\alpha_k} = x_{2k-1}$ that precede the second
occurrence of each letter, and with the $(k+1)$st undetermined variable
$x_{2k}$. We add to the system (3.6) one more equation:

$$x_0 = x_{2k}.$$

As previously, we require that the dependent variables are in the interval
$I = [0,1]$. This determines a cross section of the cube $I^{k+1}$ in the
remaining $k + 1$ coordinates with the volume which we denote by $p_H(w)$.

Due to the additional constraint $x_{2k} = x_0$, this volume might be zero.
For example, (3.7) has solutions $x_0 = 2x_2 - x_4$, $x_1 = x_3 - x_2 + x_4$
with undetermined variables $x_2$, $x_3$, $x_4$. Equation $x_0 = x_4$ gives
additional relation $x_4 = x_2$, and reduces the dimension of the solid
$\{2x_2 - x_4 \in I\} \cap \{x_3 - x_2 + x_4 \in I\} \cap \{x_4 = x_2\} \subset I^3$
to 2. Thus the corresponding volume is $p_H(abab) = 0$.
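Whether the extra relation $x_0 = x_{2k}$ cuts the dimension can be decided symbolically. The Python sketch below is illustrative only, with hypothetical names: it writes each $x_p$ as an integer linear form in the $k + 1$ free variables by solving (3.6) from right to left, returns 0 when the forms for $x_0$ and $x_{2k}$ differ, and otherwise falls back on a midpoint-grid estimate of the volume (an assumption of this sketch, not a construction from the text).

```python
from itertools import product


def hankel_forms(w):
    """Linear forms (coefficient tuples) for x_0..x_{2k} in the free variables."""
    length = len(w)
    last = {}
    partner = {}     # first-occurrence position -> second-occurrence position
    for pos in range(length, 0, -1):
        letter = w[pos - 1]
        if letter in last:
            partner[pos] = last[letter]
        else:
            last[letter] = pos
    n_free = length // 2 + 1

    def unit(t):
        return tuple(1 if s == t else 0 for s in range(n_free))

    forms = [None] * (length + 1)
    forms[length] = unit(0)              # x_{2k} is free
    next_free = 1
    for pos in range(length, 0, -1):
        if pos in partner:               # first occurrence: x_{pos-1} is dependent
            m = partner[pos]
            # x_{pos-1} = x_m + x_{m-1} - x_pos, from (3.6)
            forms[pos - 1] = tuple(a + b - c for a, b, c in
                                   zip(forms[m], forms[m - 1], forms[pos]))
        else:                            # second occurrence: x_{pos-1} is free
            forms[pos - 1] = unit(next_free)
            next_free += 1
    return forms


def hankel_volume_grid(w, grid):
    """0.0 if x_0 = x_{2k} is a genuine constraint, else a grid estimate of p_H(w)."""
    forms = hankel_forms(w)
    if forms[0] != forms[len(w)]:
        return 0.0
    n_free = len(w) // 2 + 1
    inside = 0
    for idx in product(range(grid), repeat=n_free):
        # Coefficients of every form sum to 1, so integer indices suffice.
        if all(0 <= sum(c * v for c, v in zip(f, idx)) < grid for f in forms):
            inside += 1
    return inside / grid ** n_free


if __name__ == "__main__":
    for w in ("aabb", "abba", "abab"):
        print(w, hankel_forms(w)[0], hankel_volume_grid(w, 10))
```

For $abab$ the form obtained for $x_0$ is $2x_2 - x_4$, matching the solution of (3.7) above, so the volume is reported as zero.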

We define measure $\gamma_H$ as a symmetric measure with even moments

$$m_{2k}(\gamma_H) = \sum_{w : |w| = 2k} p_H(w). \tag{3.8}$$

From Proposition 4.7 below it follows that (3.8) indeed defines a positive
definite sequence of numbers so that these are indeed the even moments of a
probability measure. Since $m_{2k}$ is at most the number $(2k - 1)!!$ of
words of length $2k$, these moments determine the limiting distribution
$\gamma_H$ uniquely.

3.4. Relation to Eulerian numbers. The Eulerian numbers $A_{n,m}$ are often
defined by their generating function or by the combinatorial description
as the number of permutations $\sigma$ of $\{1, \ldots, n\}$ with
$\sigma_i > \sigma_{i-1}$ for exactly $m$ choices of $i = 1, 2, \ldots, n$
(taking $\sigma_0 = 0$). The geometric interpretation is that $A_{n,m}/n!$ is
the volume of a solid cut out of the cube $I^n$ by the set
$\{x_1 + \cdots + x_n \in [m-1, m]\}$; see [23]. Converting any $m - 1$ of
the coordinates $x$ to $1 - x$, we get that $A_{n,m}/n!$ is the volume of a
solid cut out of the cube $I^n$ by the set

$$\{(x_1, \ldots, x_n) \in \mathbb{R}^n : x_1 + x_2 + \cdots + x_{n-m} - (x_{n-m+1} + \cdots + x_n) \in I\}.$$

The solids we encountered in the formula for the $2k$th moments are the
intersections of solids of this latter form, with odd values of $n$, each
having $m = (n-1)/2$, and with various subsets of the coordinates entering
the expression.
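The combinatorial description of $A_{n,m}$ can be checked by direct enumeration for small $n$. The Python sketch below is an illustration only (the function name is hypothetical); it counts, with the convention $\sigma_0 = 0$ stated above, the permutations having exactly $m$ indices $i$ with $\sigma_i > \sigma_{i-1}$.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def eulerian(n, m):
    """Permutations of {1..n} with sigma_i > sigma_{i-1} for exactly m indices i,
    where sigma_0 = 0 (so the index i = 1 always counts)."""
    count = 0
    for sigma in permutations(range(1, n + 1)):
        prev = 0
        rises = 0
        for s in sigma:
            if s > prev:
                rises += 1
            prev = s
        if rises == m:
            count += 1
    return count


if __name__ == "__main__":
    n = 3
    for m in range(1, n + 1):
        print(m, eulerian(n, m), Fraction(eulerian(n, m), factorial(n)))
```

For $n = 3$ and $m = 2$ the count is 4, which reproduces the volume $4/3! = 2/3$ found for $p_T(abab)$.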

Another interesting representation is 

Vol({(xi,...,x n ) G/ n :xi+x 2 H hx 

n—m (^n— m+1 

+ --- + x n ) el}) 



IT JO V t J 

This follows from the integral representation of Eulerian numbers in Nico- 
las [16]. 

Remark 3.1. One can verify that the probabilities $p_T(w)$ and $p_H(w)$
are rational numbers, and hence so are $m_{2k}(\gamma_T)$ and
$m_{2k}(\gamma_H)$, defined by formulas (3.5) and (3.8) (for details,
cf. [6]).

4. Proofs of Theorems 1.1, 1.2 and 1.3. 

4.1. Truncation and centering. We first reduce Theorems 1.1, 1.2 and 
1.3 to the case of bounded i.i.d. random variables, and in case of Theorems 
1.1 and 1.2, also allow for centering of these variables. 

Proposition 4.1. (i) If Theorem 1.1 or Theorem 1.2 holds true for all
bounded independent i.i.d. sequences $\{X_j\}$ with mean zero and variance 1,
then it holds true for all square-integrable i.i.d. sequences $\{X_j\}$ with
variance 1.

(ii) If Theorem 1.3 holds true for all bounded independent i.i.d.
collections $\{X_{ij}\}$ with mean zero and variance 1, then it holds true
for all square-integrable i.i.d. collections $\{X_{ij}\}$ with mean zero and
variance 1.

Proof. Without loss of generality, we may assume that
$\mathbb{E}(X_1) = 0$ in Theorems 1.1 and 1.2. Indeed, from the rank
inequality ([1], Lemma 2.2) it follows that subtracting a rank-1 matrix of
the means $\mathbb{E}(X_1)$ from matrices $T_n$ and $H_n$ does not affect the
asymptotic distribution of the eigenvalues.

For a fixed $u > 0$, denote

$$m(u) = \mathbb{E} X_1 I_{\{|X_1| > u\}}$$

and let

$$\sigma^2(u) = \mathbb{E} X_1^2 I_{\{|X_1| \le u\}} - m^2(u).$$

Clearly, $\sigma^2(u) \le 1$ and since $\mathbb{E}(X_1) = 0$,
$\mathbb{E}(X_1^2) = 1$, we have $m(u) \to 0$ and $\sigma(u) \to 1$ as
$u \to \infty$.
Let

$$\tilde{X}_1 = X_1 I_{\{|X_1| > u\}} - m(u).$$

Notice that $\sigma^2(u) = \mathbb{E}(X_1 - \tilde{X}_1)^2$, therefore the
bounded random variable

$$X_1' = \frac{X_1 - \tilde{X}_1}{\sigma(u)}$$

has mean zero and variance 1. Denote by $T_n'$, $H_n'$ the corresponding
Toeplitz and Hankel matrices constructed from the independent bounded random
variables

$$X_j' = \frac{X_j - \tilde{X}_j}{\sigma(u)}$$

distributed as $X_1'$. By the triangle inequality for $d_{BL}(\cdot,\cdot)$
and (2.16),

$$\begin{aligned}
d_{BL}^2(\hat\mu(T_n/\sqrt{n}), \hat\mu(T_n'/\sqrt{n}))
&\le 2 d_{BL}^2(\hat\mu(T_n/\sqrt{n}), \hat\mu(\sigma(u) T_n'/\sqrt{n})) + 2 d_{BL}^2(\hat\mu(T_n'/\sqrt{n}), \hat\mu(\sigma(u) T_n'/\sqrt{n})) \\
&\le \frac{2}{n^2} \mathrm{tr}((T_n - \sigma(u) T_n')^2) + \frac{2}{n^2} (1 - \sigma(u))^2 \mathrm{tr}((T_n')^2).
\end{aligned}$$

It is easy to verify that
$\mathbb{E}(\tilde{X}_1^2) = 1 - \sigma^2(u) - 2m(u)^2$ and that with
probability 1

$$\frac{1}{n^2} \mathrm{tr}((T_n - \sigma(u) T_n')^2) = \frac{1}{n} \tilde{X}_0^2 + \frac{2}{n} \sum_{j=1}^{n-1} \Bigl(1 - \frac{j}{n}\Bigr) \tilde{X}_j^2 \to \mathbb{E}(\tilde{X}_1^2), \tag{4.1}$$

as $n \to \infty$ (e.g., sandwiching the coefficients $j/n$ between the
piecewise constant $\ell^{-1} \lfloor \ell j/n \rfloor$ and
$\ell^{-1} \lceil \ell j/n \rceil$ allows for applying the strong law of
large numbers, with the resulting nonrandom bounds converging to
$\mathbb{E}(\tilde{X}_1^2)$ as $\ell \to \infty$). Similarly,

$$\frac{1}{n^2} \mathrm{tr}((T_n')^2) = \frac{1}{n} (X_0')^2 + \frac{2}{n} \sum_{j=1}^{n-1} \Bigl(1 - \frac{j}{n}\Bigr) (X_j')^2 \to \mathbb{E}((X_1')^2). \tag{4.2}$$

For large $u$, both $m(u)$ and $1 - \sigma(u)$ are arbitrarily small. So, in
view of (4.1) and (4.2), with probability 1 the limiting distance in the
bounded Lipschitz metric $d_{BL}$ between $\hat\mu(T_n/\sqrt{n})$ and
$\hat\mu(T_n'/\sqrt{n})$ is arbitrarily small, for all $u$ sufficiently
large. Thus, if the conclusion of Theorem 1.1 holds true for all sequences
$\{X_j'\}$ of independent bounded random variables with the same limiting
distribution $\gamma_T$, then $\hat\mu(T_n/\sqrt{n})$ must have the same weak
limit with probability 1.
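Similarly, we have

$$d_{BL}^2(\hat\mu(H_n/\sqrt{n}), \hat\mu(H_n'/\sqrt{n})) \le \frac{2}{n^2} \mathrm{tr}((H_n - \sigma(u) H_n')^2) + \frac{2}{n^2} (1 - \sigma(u))^2 \mathrm{tr}((H_n')^2).$$

By the same argument as before, with probability 1

$$\frac{1}{n^2} \mathrm{tr}((H_n - \sigma(u) H_n')^2) = \frac{1}{n} \sum_{j=1}^{2n-1} \Bigl(1 - \frac{|n - j|}{n}\Bigr) \tilde{X}_j^2 \to \mathbb{E}(\tilde{X}_1^2),$$

and $n^{-2} \mathrm{tr}((H_n')^2) \to \mathbb{E}((X_1')^2)$. Therefore, with
probability 1 the limiting $d_{BL}$-distance between $\hat\mu(H_n/\sqrt{n})$
and $\hat\mu(H_n'/\sqrt{n})$ is arbitrarily small for large enough $u$.

Similarly, denoting by $\tilde{M}_n$, $M_n'$ the corresponding Markov
matrices constructed from the independent bounded random variables
$\tilde{X}_{ij}$ and $X_{ij}' := (X_{ij} - \tilde{X}_{ij})/\sigma(u)$, we have

$$d_{BL}^2(\hat\mu(M_n/\sqrt{n}), \hat\mu(M_n'/\sqrt{n})) \le \frac{2}{n^2} \mathrm{tr}(\tilde{M}_n^2) + \frac{2}{n^2} (1 - \sigma(u))^2 \mathrm{tr}((M_n')^2).$$

By (2.18), with probability 1, $n^{-2} \mathrm{tr}((M_n')^2) \to 2$ and
$n^{-2} \mathrm{tr}(\tilde{M}_n^2) \to 2\mathbb{E}(\tilde{X}_{12}^2)$.
Therefore, with probability 1, the limiting $d_{BL}$-distance between
$\hat\mu(M_n/\sqrt{n})$ and $\hat\mu(M_n'/\sqrt{n})$ is arbitrarily small for
large enough $u$. □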

4.2. Combinatorics for Hankel and Toeplitz cases. For $k, n \in \mathbb{N}$,
consider circuits in $\{1, \ldots, n\}$ of length $L(\pi) = k$, that is,
mappings $\pi : \{0, 1, \ldots, k\} \to \{1, 2, \ldots, n\}$, such that
$\pi(0) = \pi(k)$.

Let $s : \mathbb{N}^2 \to \mathbb{N}$ be one of the following two functions:
$s_T(x, y) = |x - y|$, or $s_H(x, y) = x + y$. We will use $s$ to match
(i.e., pair) the edges $(\pi(i-1), \pi(i))$ of a circuit $\pi$. The main
property of the symmetric function $s$ is that for a fixed value of
$s(m, n)$, every initial point $m$ of an edge determines uniquely a finite
number (here, at most 2) of the other end-points: if $k, m \in \mathbb{N}$,
then

$$\#\{y \in \mathbb{N} : s(m, y) = k\} \le 2. \tag{4.3}$$

For a fixed $s$ as above, we will say that circuit $\pi$ is $s$-matched, or
has self-matched edges, if for every $1 \le i \le L(\pi)$ there is $j \ne i$
such that $s(\pi(i-1), \pi(i)) = s(\pi(j-1), \pi(j))$.

We will say that a circuit $\pi$ has an edge of order 3, if there are at
least three different edges in $\pi$ with the same $s$-value.
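For small $n$ and short circuits these definitions can be explored by exhaustive enumeration. The Python sketch below is an illustration under stated assumptions (the function names are hypothetical, and the enumeration over all $n^r$ circuits is only feasible for small parameters): it counts $s$-matched circuits of length $r$, optionally keeping only those with an edge of order 3.

```python
from collections import Counter
from itertools import product


def s_toeplitz(x, y):
    return abs(x - y)


def s_hankel(x, y):
    return x + y


def count_matched_circuits(n, r, s, need_order3=False):
    """Number of s-matched circuits pi: {0..r} -> {1..n} with pi(0) = pi(r)."""
    total = 0
    for start in product(range(1, n + 1), repeat=r):
        pi = start + (start[0],)                 # close the circuit
        values = Counter(s(pi[i - 1], pi[i]) for i in range(1, r + 1))
        if min(values.values()) < 2:             # some edge has no partner
            continue
        if need_order3 and max(values.values()) < 3:
            continue
        total += 1
    return total


if __name__ == "__main__":
    for n in (3, 4, 5, 6):
        print(n,
              count_matched_circuits(n, 4, s_toeplitz),
              count_matched_circuits(n, 4, s_toeplitz, need_order3=True))
```

For $r = 3$ a matched circuit must have all three edges with the same $s$-value, and for both $s_T$ and $s_H$ this forces a constant circuit, so the count is $n$.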

The following proposition says that generically self-matched circuits have 
only pair-matches. 
Proposition 4.2. Fix $r \in \mathbb{N}$. Let $N$ denote the number of
$s$-matched circuits in $\{1, \ldots, n\}$ of length $r$ with at least one
edge of order 3. Then there is a constant $C_r$ such that

$$N \le C_r n^{\lfloor (r+1)/2 \rfloor}.$$

In particular, as $n \to \infty$ we have $N / n^{1 + r/2} \to 0$.

Proof. Either $r = 2k$ is an even number, or $r = 2k - 1$ is an odd number.
In both cases, if an $s$-matched circuit has an edge of order 3, then the
total number of distinct $s$-values

$$\{s(\pi(i-1), \pi(i)) : 1 \le i \le L(\pi)\}$$

is at most $k - 1$. We can think of constructing each such circuit from the
left to the right. First, we choose the locations for the $s$-matches along
$\{1, \ldots, r\}$. This can be done in at most $r!$ ways. Once these
locations are fixed, we proceed along the circuit. There are $n$ possible
choices for the initial point $\pi(0)$. There are at most $n$ choices for
each new $s$-value, and there are at most two ways to complete the edge for
each repeat of the already encountered $s$-value. Therefore there are at most
$r! \times n \times n^{k-1} 2^{r+1-k} \le C_r n^k$ such circuits. □

We say that a set of circuits $\pi_1, \pi_2, \pi_3, \pi_4$ is matched if each
edge of any one of these circuits is either self-matched, that is, there is
another edge of the same circuit with equal $s$-value, or is cross-matched,
that is, there is an edge of the other circuit with the same $s$-value (or
both).

The following bound will be used to prove almost sure convergence of 
moments. 

Proposition 4.3. Fix $r \in \mathbb{N}$. Let $N$ denote the number of matched
quadruples of circuits in $\{1,\dots,n\}$ of length $r$ such that none of them is
self-matched. Then there is a constant $C_r$ such that

$N \le C_r n^{2r+2}.$

Proof. First observe that there are at most $2r$ distinct $s$-values in the
$4r$ edges of matched quadruples of circuits of length $r$. Further, the number of
quadruples of such circuits for which there are exactly $u$ distinct $s$-values is at
most $C_{r,u} n^{u+4}$. Indeed, order the edges $(\pi_j(i-1),\pi_j(i))$ of such quadruples
starting at $j = 1$, $i = 1$, then $i = 2,\dots,r$, followed by $j = 2$, $i = 1$ and then
$i = 2,\dots,r$, and so on. There are at most $u^{4r}$ possible allocations of the
distinct $s$-values to these $4r$ edges, at most $n^4$ choices for the starting points
$\pi_1(0)$, $\pi_2(0)$, $\pi_3(0)$ and $\pi_4(0)$ of the circuits and at most $n^u$ for the values
of $\pi_j(i)$ at those $(j,i)$ for which $(\pi_j(i-1),\pi_j(i))$ is the leftmost occurrence
of one of the distinct $s$-values. Once these choices are made, we proceed to
sequentially determine the mapping $\pi_1(i)$ from $i = 0$ to $i = r$, followed by
the mappings $\pi_2,\pi_3,\pi_4$, noting that by (4.3) at most $2^{4r-u-4}$ quadruples
can be produced per such choice.

Recall that the number of possible partitions $\mathcal{P}$ of the $4r$ edges of our
quadruple of circuits into $|\mathcal{P}|$ distinct groups of $s$-matching edges, with at
least two edges in each group, is independent of $n$. Since the preceding bound
$C_{r,u} n^{u+4}$ is already at most $C n^{2r+2}$ when $u \le 2r-2$, it suffices to show
that for each partition $\mathcal{P}$ with $|\mathcal{P}| \in \{2r-1, 2r\}$ such that each circuit
shares at least one $s$-value with some other circuit, there correspond at most
$Cn^{2r+2}$ matched quadruples of circuits in $\{1,\dots,n\}$. To this end, note that
$|\mathcal{P}| = 2r$ implies that each $s$-value is shared by exactly two edges, while
when $|\mathcal{P}| = 2r-1$ we also have either two $s$-values shared by three edges each
or one $s$-value shared by four edges (but not both).

Fixing hereafter a specific partition $\mathcal{P}$ of this type, it is not hard to check
that upon re-ordering our four circuits we have an $s$-value that is assigned to
exactly one edge of the circuit $\pi_1$, denoted hereafter $(\pi_1(i^*-1),\pi_1(i^*))$, and
in case $|\mathcal{P}| = 2r$, we also have another $s$-value that does not appear in $\pi_1$ and
is assigned to exactly one edge of $\pi_2$, denoted hereafter $(\pi_2(j^*-1),\pi_2(j^*))$.
(Though this property may not hold for all orderings of the four circuits, an
inspection of all possible graphs of cross-matches shows that it must hold
for some order.)

We are now ready to improve our counting bound for the case of
$|\mathcal{P}| = 2r-1$, by the following dynamic construction of $\pi_1$:

First choose one of the $n$ possible values for the initial value $\pi_1(0)$,
and continue filling in the values of $\pi_1(i)$, $i = 1,2,\dots,i^*-1$. Then, starting
at $\pi_1(r) = \pi_1(0)$, sequentially choose the values of $\pi_1(r-1), \pi_1(r-2),
\dots, \pi_1(i^*)$, thus completing the entire circuit $\pi_1$. This is done in
accordance with the $s$-matches determined by $\mathcal{P}$, so there are $n$ ways to complete
an edge that has no $s$-match among the edges already constructed, while by
(4.3) if an edge is matching one of the edges already available, then it can
be completed in at most two ways. Since this procedure determines uniquely
the edge $(\pi_1(i^*-1),\pi_1(i^*))$ and hence the $s$-value assigned to it, it reduces
to $2r-2$ the number of $s$-matches that can each independently assume $O(n)$
values. Consequently, the number of quadruples of circuits corresponding to
$\mathcal{P}$ is at most $Cn^{2r+2}$.

In case $|\mathcal{P}| = 2r$, we first construct $\pi_1$ by the preceding dynamic
construction while determining the $s$-value for the edge $(\pi_1(i^*-1),\pi_1(i^*))$ out
of the circuit condition for $\pi_1$. Then, we repeat the dynamic construction for
$\pi_2$, keeping it in accordance with the $s$-values determined already by edges
of $\pi_1$ and uniquely determining the edge $(\pi_2(j^*-1),\pi_2(j^*))$ and hence the
$s$-value assigned to it, by the circuit condition for $\pi_2$. Thus, we have again
reduced the total number of $s$-matches that can each independently assume
$O(n)$ values to $2r-2$, and consequently, the number of quadruples of circuits
corresponding to $\mathcal{P}$ is again at most $Cn^{2r+2}$. □

The next result deals only with the slope matching function $s_T(x,y) =
|x - y|$.

Proposition 4.4. Fix $k \in \mathbb{N}$. Let $N$ be the number of $s_T$-matched
circuits $\pi$ in $\{1,\dots,n\}$ of length $2k$ with at least one pair of $s_T$-matched edges
$(\pi(i-1),\pi(i))$ and $(\pi(j-1),\pi(j))$ such that $\pi(i)-\pi(i-1)+\pi(j)-\pi(j-1) \ne 0$.
Then, as $n \to \infty$ we have

$n^{-(k+1)} N \to 0.$

Proof. By Proposition 4.2, we may and shall consider throughout paths
$\pi$ in $\{1,\dots,n\}$ of length $2k$ for which the absolute values of the slopes
$\pi(i)-\pi(i-1)$ take exactly $k$ distinct nonzero values and, for $\pi$ to be a
circuit, the sum of all $2k$ slopes is zero. Let $\mathcal{P}$ denote a partition of the $2k$
slopes to $s_T$-matching pairs, indicating also whether each slope is negative
or positive, with $m(\mathcal{P})$ denoting the number of such pairs for which both
slopes are positive. Observe that if under $\mathcal{P}$ both slopes of some $s_T$-matching
pair are negative, then necessarily $m(\mathcal{P}) \ge 1$, for otherwise the sum of all
slopes will not be zero for any path corresponding to $\mathcal{P}$. Thus, it suffices to
show that at most $n^k$ circuits $\pi$ correspond to each $\mathcal{P}$ with $m = m(\mathcal{P}) \ge 1$.
Indeed, fixing such $\mathcal{P}$, there are at most $n$ ways to choose $\pi(0)$ and $n^{k-m}$
ways to choose the $k-m$ pairs of slopes for which at least one slope in each
pair is negative. The remaining $m$ pairs of $s_T$-matching positive slopes are
to be chosen among $\{1,\dots,n\}$ subject to a specified sum (due to the circuit
condition). Since there are at most $n^{m-1}$ ways for doing so, the proof is
complete. □

4.3. Moments of the average spectral measure. 



Proposition 4.5. Suppose $\{X_j\}$ is a sequence of bounded i.i.d. random
variables such that $E(X_1) = 0$, $E(X_1^2) = 1$. Then for $k \in \mathbb{N}$

(4.4) $\lim_{n\to\infty} \frac{1}{n^{k+1}} E\,\mathrm{tr}(T_n^{2k}) = \sum_{w : |w| = 2k} p_T(w)$

and

(4.5) $\lim_{n\to\infty} \frac{1}{n^{k+1/2}} E\,\mathrm{tr}(T_n^{2k-1}) = 0.$

Proof. For a circuit $\pi : \{0,1,\dots,r\} \to \{1,2,\dots,n\}$ write

(4.6) $X_\pi = \prod_{i=1}^{r} X_{\pi(i)-\pi(i-1)}.$

Then

(4.7) $E\,\mathrm{tr}(T_n^r) = \sum_{\pi} E X_\pi,$

where the sum is over all circuits in $\{1,\dots,n\}$ of length $r$.
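
The identity behind (4.7) holds before taking expectations: the trace of a
power is the sum of $X_\pi$ over all circuits. The sketch below is ours and
checks this for a small Toeplitz matrix with integer entries. The function
names are hypothetical, and we assume the symmetric convention
$T_n = [X_{|i-j|}]$, that is, the index in (4.6) is read through its absolute
value.

```python
from itertools import product


def toeplitz(x, n):
    """Symmetric Toeplitz matrix T_n = [x[|i - j|]] built from x[0..n-1]."""
    return [[x[abs(i - j)] for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Plain-Python product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][l] * b[l][j] for l in range(n)) for j in range(n)]
            for i in range(n)]


def trace_of_power(a, r):
    """tr(a^r) for an integer r >= 1."""
    p = a
    for _ in range(r - 1):
        p = matmul(p, a)
    return sum(p[i][i] for i in range(len(a)))


def circuit_sum(x, n, r):
    """Sum of X_pi over all circuits pi of length r, as in (4.6)-(4.7)."""
    total = 0
    for head in product(range(n), repeat=r):
        pi = head + (head[0],)              # pi(r) = pi(0)
        term = 1
        for i in range(1, r + 1):
            term *= x[abs(pi[i] - pi[i - 1])]
        total += term
    return total


if __name__ == "__main__":
    x = [2, -1, 3, 1]                       # arbitrary integer test values
    for r in (2, 3, 4):
        print(r, trace_of_power(toeplitz(x, 4), r), circuit_sum(x, 4, r))
```

Only differences of positions enter, so using the index set $\{0,\dots,n-1\}$
instead of $\{1,\dots,n\}$ does not change the sum.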

By Hölder's inequality, for any finite set $\Pi$ of circuits of length $r$

(4.8) $\left| \sum_{\pi \in \Pi} E X_\pi \right| \le E(|X|^r)\, \#\Pi.$

Since $|X|^r$ is bounded, we can use the bound (4.8) to discard the
"nongeneric" circuits from the sum in (4.7). To this end, note that since the
random variables $\{X_j\}$ are independent and have mean zero, the term $EX_\pi$
vanishes for every circuit $\pi$ with at least one unpaired $X_j$. Since $T_n$ is a
symmetric matrix, by (4.6) paired variables correspond to the slopes of the
circuit $\pi$ which are equal in absolute value. Hence, the only circuits that
make a nonzero contribution to (4.7) are those with matched absolute values
of the slopes. This fits the formalism of Section 4.2 with the matching
function $s_T(x,y) = |x - y|$.

If $r = 2k-1 > 0$ is odd, then each $s_T$-matched circuit $\pi$ of length $r$ must
have an edge of order 3. From (4.8) and Proposition 4.2 we get
$|E\,\mathrm{tr}(T_n^{2k-1})| \le Cn^k$, proving (4.5).

When $r = 2k$ is an even number, let $\Pi$ be the set of all circuits
$\pi : \{0,1,\dots,2k\} \to \{1,\dots,n\}$ with the set of slopes
$\{\pi(i)-\pi(i-1) : i = 1,\dots,2k\}$ consisting of $k$ distinct nonnegative integers
$s_1,\dots,s_k$ and their counterparts $-s_1,\dots,-s_k$. From (4.8) and
Proposition 4.4 it follows that

$\lim_{n\to\infty} \frac{1}{n^{k+1}} \left| E\,\mathrm{tr}(T_n^{2k}) - \sum_{\pi \in \Pi} E X_\pi \right| = 0.$

Moreover, for every circuit $\pi \in \Pi$, if $X_j$ enters the product $X_\pi$, then it
occurs in it exactly twice, resulting with $EX_\pi = 1$, and consequently with
$\sum_{\pi \in \Pi} EX_\pi = \#\Pi$. Therefore, the following lemma completes the proof of
(4.4), and with it, that of Proposition 4.5. □



Lemma 4.6.

$\lim_{n\to\infty} \frac{1}{n^{k+1}} \#\Pi = \sum_{w : |w| = 2k} p_T(w),$

where the sum is over the finite set of partition words $w$ of length $2k$.

Proof. The circuits in $\Pi$ can be labeled by the partition words $w$ of
length $2k$ which list the positions of the pairs of $s_T$-matches along $\{1,\dots,2k\}$.
This generates the partition $\Pi = \bigcup_w \Pi(w)$ into the corresponding
equivalence classes.

To every such partition word $w$ we can assign $n^{k+1}$ paths $\pi(i) = x_i$,
$i = 0,\dots,2k$, obtained by solving the system of equations (3.2), with
values $1,2,\dots,n$ for each of the $k+1$ undetermined variables, and the
remaining $k$ values computed from the equations [which represent the relevant
$s_T$-matches for any $\pi \in \Pi(w)$]. Some of these paths will fail to be in the
admissible range $\{1,\dots,n\}$. Let $p_n(w)$ be the fraction of the $n^{k+1}$ paths that
stay within the admissible range $\{1,\dots,n\}$, noting that by Proposition 4.2,
$p_n(w) - n^{-(k+1)} \#\Pi(w) \to 0$.

Interpreting the undetermined variables $x_j$ as the discrete uniform
independent random variables with values $\{1,2,\dots,n\}$, $p_n(w)$ becomes the
probability that the computed values stay within the prescribed range. As
$n \to \infty$, the $k+1$ undetermined variables $x_j/n$ converge in law to
independent uniform $U[0,1]$ random variables $U_j$. Since $p_n(w)$ is the probability of
the (independent of $n$) event $A_w$ that the solution of (3.2) starting with
$x_j/n \in \{1/n, 2/n, \dots, 1\}$ has all the dependent variables in $(0,1]$, it follows
that $p_n(w)$ converges to $p_T(w)$, the probability of the event $A_w$ that the
corresponding sums of independent uniform $U[0,1]$ random variables take
their values in the interval $[0,1]$. □

Next we give the Hankel version of Proposition 4.5. 

Proposition 4.7. Let $\{X_j\}$ be a sequence of bounded i.i.d. random
variables such that $E(X_1) = 0$, $E(X_1^2) = 1$. For $k \in \mathbb{N}$,

(4.9) $\lim_{n\to\infty} \frac{1}{n^{k+1}} E\,\mathrm{tr}(H_n^{2k}) = \sum_{w : |w| = 2k} p_H(w)$

and

(4.10) $\lim_{n\to\infty} \frac{1}{n^{k+1/2}} E\,\mathrm{tr}(H_n^{2k-1}) = 0.$

Proof. We mimic the procedure for the Toeplitz case. For a circuit
$\pi : \{0,1,\dots,r\} \to \{1,2,\dots,n\}$ write

(4.11) $X_\pi = \prod_{i=1}^{r} X_{\pi(i)+\pi(i-1)}.$

As previously,

(4.12) $E\,\mathrm{tr}(H_n^r) = \sum_{\pi} E X_\pi,$

where the sum is over all circuits in $\{1,\dots,n\}$ of length $r$, and by Hölder's
inequality, we again have the bound (4.8), which for bounded $|X|^r$ we use
to discard the "nongeneric" circuits from the sum in (4.12). To this end,
with the random variables $X_j$ independent and of mean zero, the term $EX_\pi$
vanishes for every circuit $\pi$ with at least one unpaired $X_j$. By (4.11), in
the current setting paired variables correspond to an $s_H$-matching in the
circuit $\pi$. Hence, only $s_H$-matched circuits (in the formalism of Section 4.2)
can make a nonzero contribution to (4.12).

If $r = 2k-1 > 0$ is odd, then each $s_H$-matched circuit $\pi$ of length $r$ must
have an edge of order 3. From (4.8) and Proposition 4.2 we get
$|E\,\mathrm{tr}(H_n^{2k-1})| \le Cn^k$, proving (4.10).

When $r = 2k$ is an even number, let $\Pi$ be the set of all circuits
$\pi : \{0,1,\dots,2k\} \to \{1,\dots,n\}$ with the $s_H$-values consisting of $k$ distinct
numbers. Recall that $EX_\pi = 1$ for any $\pi \in \Pi$ [see (4.11)]. Further, with any
$s_H$-matched circuit not in $\Pi$ having an edge of order 3, it follows from (4.8)
and Proposition 4.2 that

$\lim_{n\to\infty} \frac{1}{n^{k+1}} \left| E\,\mathrm{tr}(H_n^{2k}) - \#\Pi \right| = 0.$

Therefore, the following lemma completes the proof of (4.9), and with it,
that of Proposition 4.7. □

Lemma 4.8.

$\lim_{n\to\infty} \frac{1}{n^{k+1}} \#\Pi = \sum_{w : |w| = 2k} p_H(w).$

Proof. Similarly to the proof of Lemma 4.6, label the circuits in $\Pi$
by the partition words $w$ which list the positions of the pairs of $s_H$-matches
along $\{1,\dots,2k\}$, with the corresponding partition $\Pi = \bigcup_w \Pi(w)$ into
equivalence classes. To every such partition word $w$ we can assign $n^{k+1}$ paths
$\pi(i) = x_i$, $i = 0,\dots,2k$, obtained by solving the system of equations (3.6),
with values $1,2,\dots,n$ for each of the $k+1$ undetermined variables, and the
remaining $k$ values computed from the equations. Some of these paths will fail to
be a circuit, and some will fail to stay in the admissible range $\{1,\dots,n\}$.
Let $p_n(w)$ denote the fraction of the paths that stay within the admissible
range $\{1,\dots,n\}$ and are circuits, noting that $p_n(w) - n^{-(k+1)} \#\Pi(w) \to 0$
by Proposition 4.2. Thus, $p_n(w)$ is the probability of the event $A_w$ that the
solution of (3.6) starting with the undetermined variables $x_j$ that are
independent discrete uniform random variables on the set $\{1/n, 2/n, \dots, 1\}$,
stays within $(0,1]$ and satisfies the additional condition $x_0 = x_{2k}$. It follows
that as $n \to \infty$, the probabilities $p_n(w)$ converge to $p_H(w)$, the probability
of the event $A_w$ with the undetermined variables now being independent
and uniformly distributed on $[0,1]$. □

4.4. Concentration of moments of the spectral measure.

Proposition 4.9. Let $\{X_j\}$ be a sequence of bounded i.i.d. random
variables such that $E(X_1) = 0$ and $E(X_1^2) = 1$. Fix $r \in \mathbb{N}$. Then there is
$C_r < \infty$ such that for all $n \in \mathbb{N}$ we have

$E[(\mathrm{tr}(T_n^r) - E\,\mathrm{tr}(T_n^r))^4] \le C_r n^{2r+2}$ and $E[(\mathrm{tr}(H_n^r) - E\,\mathrm{tr}(H_n^r))^4] \le C_r n^{2r+2}.$

Proof. The argument again relies on the enumeration of paths. Since
both proofs are very similar, we analyze only the Hankel case.
Using the circuit notation of (4.11) we have that

(4.13) $E[(\mathrm{tr}(H_n^r) - E\,\mathrm{tr}(H_n^r))^4] = \sum_{\pi_1,\pi_2,\pi_3,\pi_4} E\left[ \prod_{j=1}^{4} (X_{\pi_j} - E(X_{\pi_j})) \right],$

where the sum is taken over all circuits $\pi_j$, $j = 1,\dots,4$ on $\{1,\dots,n\}$ of length
$r$ each. With the random variables $X_j$ independent and of mean zero, any
circuit $\pi_k$ which is not matched together with the remaining three circuits
has $E(X_{\pi_k}) = 0$ and

$E\left[ \prod_{j=1}^{4} (X_{\pi_j} - E(X_{\pi_j})) \right] = E\left[ X_{\pi_k} \prod_{j \ne k} (X_{\pi_j} - E(X_{\pi_j})) \right] = 0.$

Further, if one of the circuits, say $\pi_1$, is only self-matched, that is, has no
cross-matched edge, then obviously

$E\left[ \prod_{j=1}^{4} (X_{\pi_j} - E(X_{\pi_j})) \right] = E[X_{\pi_1} - E(X_{\pi_1})]\, E\left[ \prod_{j=2}^{4} (X_{\pi_j} - E(X_{\pi_j})) \right] = 0.$

Therefore, it suffices to take the sum in (4.13) over all $s_H$-matched
quadruples of circuits on $\{1,\dots,n\}$, such that none of them is self-matched. By
Proposition 4.3, there are at most $C_r n^{2r+2}$ such quadruples of circuits, and
with $|X|$ (hence $|X_\pi|$) bounded, this completes the proof. □

4.5. Proofs of the Hankel and Toeplitz cases. 

Proof of Theorem 1.1. Proposition 4.1(i) implies that without loss
of generality we may assume that the random variables $\{X_j\}$ are centered
and bounded.

By Proposition 4.5 the odd moments of the average measure $E(\hat\mu(T_n/\sqrt{n}))$
converge to 0, and the even moments converge to $m_{2k}$ of (3.5). By Chebyshev's
inequality we have from Proposition 4.9 that for any $\delta > 0$ and
$k, n \in \mathbb{N}$,

$P\left( \left| \int x^k \, d\hat\mu(T_n/\sqrt{n}) - \int x^k \, dE(\hat\mu(T_n/\sqrt{n})) \right| > \delta \right) \le C_k \delta^{-4} n^{-2}.$

Thus, by the Borel-Cantelli lemma, with probability 1,
$\int x^k \, d\hat\mu(T_n/\sqrt{n}) \to \int x^k \, d\gamma_T$ as $n \to \infty$, for every $k \in \mathbb{N}$. In
particular, with probability 1, the random measures $\{\hat\mu(T_n/\sqrt{n})\}$ are tight,
and since the moments determine $\gamma_T$ uniquely, we have the weak convergence
of $\hat\mu(T_n/\sqrt{n})$ to $\gamma_T$.

Since the moments do not depend on the distribution of the i.i.d. sequence
$\{X_j\}$, the limiting distribution does not depend on the distribution of
$X$ either, and is symmetric as all its odd moments are zero. By Proposition
A.1, it has unbounded support. □
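
The Chebyshev step in the proof above can be spelled out as follows. This
expansion is ours; it uses only the normalization
$\int x^k \, d\hat\mu(T_n/\sqrt{n}) = n^{-(1+k/2)} \mathrm{tr}(T_n^k)$ and the fourth-moment bound of
Proposition 4.9 with $r = k$.

```latex
\begin{align*}
P\Bigl( \Bigl| \int x^k \, d\hat\mu(T_n/\sqrt{n})
      - \int x^k \, dE\hat\mu(T_n/\sqrt{n}) \Bigr| > \delta \Bigr)
  &= P\bigl( |\mathrm{tr}(T_n^k) - E\,\mathrm{tr}(T_n^k)| > \delta\, n^{1+k/2} \bigr) \\
  &\le \frac{E[(\mathrm{tr}(T_n^k) - E\,\mathrm{tr}(T_n^k))^4]}{\delta^4\, n^{4+2k}} \\
  &\le \frac{C_k\, n^{2k+2}}{\delta^4\, n^{2k+4}}
   = C_k\, \delta^{-4}\, n^{-2}.
\end{align*}
```

Since $\sum_n n^{-2} < \infty$, the Borel-Cantelli lemma applies for each fixed $\delta$
and $k$.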

Proof of Theorem 1.2. We follow the same line of reasoning as in
the proof of Theorem 1.1, starting by assuming without loss of generality
that $\{X_j\}$ is a sequence of centered and bounded random variables, in view
of Proposition 4.1(i). Then, by Proposition 4.7, as $n \to \infty$ the odd moments
of the average measure $E(\hat\mu(H_n/\sqrt{n}))$ converge to 0, and the even moments
converge to $m_{2k}$ of (3.8), whereas from Proposition 4.9 we conclude that
with probability 1 the same applies to the moments of $\hat\mu(H_n/\sqrt{n})$. The
almost sure convergence $\int x^k \, d\hat\mu(H_n/\sqrt{n}) \to \int x^k \, d\gamma_H$ as $n \to \infty$, for all
$k \in \mathbb{N}$, implies tightness of $\hat\mu(H_n/\sqrt{n})$ and its weak convergence to the
nonrandom measure $\gamma_H$. Since its moments do not depend on the distribution of the
i.i.d. sequence $\{X_j\}$, neither does the limiting distribution $\gamma_H$, which is
symmetric since all its odd moments are zero. By Proposition A.2 it has unbounded
support, and is not unimodal. □

4.6. Markov matrices with centered entries. In view of Proposition 4.1(ii)
we may and shall assume hereafter without loss of generality that the random
variables $X_{ij}$ are bounded. Our proof of Theorem 1.3 follows a similar outline
as that used in proving Theorems 1.1 and 1.2, where the combinatorial
arguments used here rely on matrix decomposition.

Starting with some notation we shall use throughout the proof, let $\Gamma_n$ be
a graph whose vertices are two-element subsets of $\{1,\dots,n\}$ with the edges
between vertices $a$ and $b$ if the sets overlap, $a \cap b \ne \emptyset$. We indicate that
$(a,b)$ is an edge of $\Gamma_n$ by writing $a \sim b$, and for $a \in \Gamma_n$ let $a = \{a^-, a^+\}$ with
$1 \le a^- < a^+ \le n$.

The main tool in the Markov case is the following decomposition:

$M_n = \sum_{a \in \Gamma_n} X_a Q_{a,a},$

where $X_a := X_{a^+,a^-}$ and $Q_{a,b}$ is the $n \times n$ matrix defined for vertices $a, b$ of
$\Gamma_n$ by

$Q_{a,b}[i,j] = \begin{cases}
-1, & \text{if } i = a^+, j = b^+, \text{ or } i = a^-, j = b^-, \\
1, & \text{if } i = a^+, j = b^-, \text{ or } i = a^-, j = b^+, \\
0, & \text{otherwise.}
\end{cases}$

Let $t_{a,b} = \mathrm{tr}(Q_{a,b})$. It is straightforward to check that

$t_{a,b} = \begin{cases}
-2, & \text{if } a = b, \\
-1, & \text{if } a \ne b \text{ and } a^- = b^- \text{ or } a^+ = b^+, \\
1, & \text{if } a^- = b^+ \text{ or } a^+ = b^-, \\
0, & \text{otherwise.}
\end{cases}$

From this, we see that $t_{a,b} = t_{b,a}$. Since it is easy to check that
$Q_{a,b} \times Q_{c,d} = t_{b,c}\, Q_{a,d}$, we get

(4.14) $\mathrm{tr}(Q_{a_1,a_1} \cdots Q_{a_r,a_r}) = \prod_{j=1}^{r} t_{a_j,a_{j+1}},$

where for convenience we identified $a_{r+1}$ with $a_1$.
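
The decomposition of $M_n$ and the product rule for the matrices $Q_{a,b}$ are
finite identities, so they can be checked mechanically for a small $n$. The
sketch below is ours; the function names are hypothetical, vertices are
stored as 0-based pairs `(a_minus, a_plus)`, and $M_n = X_n - D_n$ with $D_n$ the
diagonal matrix of row sums of $X_n$ is assumed, as recalled later from (1.3).

```python
from itertools import combinations


def q_matrix(a, b, n):
    """Q_{a,b} for vertices a = (a_minus, a_plus), b = (b_minus, b_plus)."""
    (am, ap), (bm, bp) = a, b
    q = [[0] * n for _ in range(n)]
    q[ap][bp] = -1
    q[am][bm] = -1
    q[ap][bm] = 1
    q[am][bp] = 1
    return q


def trace(m):
    return sum(m[i][i] for i in range(len(m)))


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][l] * b[l][j] for l in range(n)) for j in range(n)]
            for i in range(n)]


def markov_matrix(x):
    """M_n = X_n - D_n, where D_n = diag(row sums of the symmetric X_n)."""
    n = len(x)
    return [[x[i][j] - (sum(x[i]) if i == j else 0) for j in range(n)]
            for i in range(n)]


def decomposition(x):
    """Sum of X_a * Q_{a,a} over all two-element subsets a."""
    n = len(x)
    total = [[0] * n for _ in range(n)]
    for a in combinations(range(n), 2):     # a = (a_minus, a_plus)
        q = q_matrix(a, a, n)
        for i in range(n):
            for j in range(n):
                total[i][j] += x[a[1]][a[0]] * q[i][j]
    return total


if __name__ == "__main__":
    # Arbitrary symmetric integer test matrix (not from the paper).
    x = [[5, 1, -2, 3], [1, 7, 4, 0], [-2, 4, -1, 6], [3, 0, 6, 2]]
    print(markov_matrix(x) == decomposition(x))
    a, b, c, d = (0, 1), (1, 2), (1, 3), (0, 3)
    lhs = matmul(q_matrix(a, b, 4), q_matrix(c, d, 4))
    t_bc = trace(q_matrix(b, c, 4))
    rhs = [[t_bc * e for e in row] for row in q_matrix(a, d, 4)]
    print(lhs == rhs)
```

One way to see why the product rule holds: $Q_{a,b} = -v_a v_b^{\top}$ with
$v_a = e_{a^+} - e_{a^-}$, so that $t_{a,b} = -v_a \cdot v_b$.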

For a circuit $\pi = (a_1 \sim \cdots \sim a_r \sim a_1)$ of length $r$ in $\Gamma_n$ let

(4.15) $X_\pi = \prod_{j=1}^{r} X_{a_j} \prod_{j=1}^{r} t_{a_j,a_{j+1}}.$

It follows from (4.14) and (4.15) that

(4.16) $\mathrm{tr}(M_n^r) = \sum_{\pi} X_\pi,$

where the sum is over all circuits of length $r$ in $\Gamma_n$, leading to the Markov
analog of the path expansion (4.7),

(4.17) $E\,\mathrm{tr}(M_n^r) = \sum_{\pi} E X_\pi.$

We say that a circuit $\pi = (a_1 \sim \cdots \sim a_r \sim a_1)$ of length $r$ in $\Gamma_n$ is
vertex-matched if for each $i = 1,\dots,r$ there exists some $j \ne i$ such that $a_i = a_j$, and
that it has a match of order 3 if some value is repeated at least three times
among $(a_j, j = 1,\dots,r)$. Note that the only nonvanishing terms in (4.17)
come from vertex-matched circuits.

In analogy with Proposition 4.2, we show next that generically
vertex-matched circuits have only double repeats, and consequently, the odd
moments of $E\hat\mu(M_n/\sqrt{n})$ converge to zero as $n \to \infty$.

Proposition 4.10. Fix $r \in \mathbb{N}$. Let $N$ denote the number of
vertex-matched circuits in $\Gamma_n$ with $r$ vertices which have at least one match of
order 3. Then there is a constant $C_r$ such that for all $n \in \mathbb{N}$

$N \le C_r n^{\lfloor (r+1)/2 \rfloor}.$

Proof. Either $r = 2k$ is even, or $r = 2k-1$ is odd. In both cases, the
total number of different vertices per path is at most $k-1$. Since
$a_1 \sim a_2 \sim \cdots \sim a_r$, there are at most $n^2/2$ choices for $a_1$, and then at
most $4n$ choices for each of the remaining $k-2$ distinct values of $a_j$, and one
choice for each repeated value. Thus $N \le 4^r n^2 \times n^{k-2} = Cn^k$. □

Corollary 4.11. Suppose $\{X_{ij}; j > i \ge 1\}$ are bounded i.i.d. random
variables such that $E(X_{12}) = 0$, $E(X_{12}^2) = 1$. Then,

(4.18) $\lim_{n\to\infty} \frac{1}{n^{k+1/2}} E\,\mathrm{tr}(M_n^{2k-1}) = 0.$

Proof. If $EX_\pi$ is nonzero, then all the vertices of the path
$a_1 \sim a_2 \sim \cdots \sim a_{2k-1}$ must be repeated at least twice. So for an odd number of
vertices, there must be a vertex which is repeated at least three times. Thus, by
Proposition 4.10 and the boundedness of $|X_{ij}|$ and of $t_{a,b}$,

$|E\,\mathrm{tr}(M_n^{2k-1})| \le C_k n^k,$

and (4.18) follows. □

Let $W_n = n^{1/2} Z_n + X_n + \xi I_n$, where $X_n$ is a symmetric $n \times n$ matrix
with i.i.d. standard normal random variables (except for the symmetry
constraint), $Z_n = \mathrm{diag}(Z_{ii})_{1 \le i \le n}$, with i.i.d. standard normal variables $Z_{ii}$
that are independent of $X_n$ and $\xi$ is a standard normal, independent of
all other variables. A direct combinatorial evaluation of the even moments
of $E\hat\mu(M_n/\sqrt{n})$ is provided in [6]. We follow here an alternative, shorter
proof, proposed to us by O. Zeitouni. The key step, provided by our next
lemma, replaces the even moments by those of the better understood matrix
ensemble $W_n$.



Lemma 4.12. Suppose $\{X_{ij}; j > i \ge 1\}$ is a collection of bounded i.i.d.
random variables such that $E(X_{12}) = 0$, $E(X_{12}^2) = 1$. Then, for every $k \in \mathbb{N}$,

(4.19) $\lim_{n\to\infty} n^{-(k+1)} [E\,\mathrm{tr}(M_n^{2k}) - E\,\mathrm{tr}(W_n^{2k})] = 0.$

Proof. First observe that by Proposition 4.10, we may and shall
assume without loss of generality that $\{X_{ij}\}$ is a collection of i.i.d. standard
normal random variables, subject to the symmetry constraint $X_{ij} = X_{ji}$ [as
such a change affects $n^{-(k+1)} E\,\mathrm{tr}(M_n^{2k})$ by at most $C_k n^{-1}$]. Recall the
representation $M_n = X_n - D_n$ of (1.3) and let $\tilde M_n = X_n - \tilde D^{(n)}$ where $\tilde D^{(n)}$ is
obtained by omitting the last row and column of the diagonal matrix $\tilde D_{n+1}$
which is an independent copy of $D_{n+1}$ that is independent of $X_n$. Observe
that the diagonal entries of $-\tilde D^{(n)}$ are jointly normal, of zero mean,
variance $n+1$ and such that the covariance of each pair is 1. Therefore, with
$-\tilde D^{(n)}$ independent of $X_n$, for each $n$, the distribution of $\tilde M_n$ is exactly the
same as that of $W_n$. Consequently, (4.19) is equivalent to

(4.20) $\lim_{n\to\infty} n^{-(k+1)} E[\mathrm{tr}(M_n^{2k}) - \mathrm{tr}(\tilde M_n^{2k})] = 0.$

The first step in proving (4.20) is to note that by a path expansion similar
to (4.17) we have that

(4.21) $E[\mathrm{tr}(M_n^{2k}) - \mathrm{tr}(\tilde M_n^{2k})] = \sum_{\pi} [E M_\pi - E \tilde M_\pi],$

where now the sum is over all circuits $\pi : \{0,\dots,2k\} \to \{1,\dots,n\}$, and

$M_\pi = \prod_{i=1}^{2k} M_{\pi(i-1),\pi(i)}$

with the corresponding expression for $\tilde M_\pi$. Set each word $w$ of length $2k$
to be a circuit by assigning $w[0] = w[2k]$ and let $\Pi(w)$ denote the
collection of circuits $\pi$ such that the distinct letters of $w$ are in a one-to-one
correspondence with the distinct values of $\pi$. Let $v = v(w)$ be the
number of distinct letters in the word $w$, noting that $\#\Pi(w) \le n^{v(w)}$ and that
$EM_\pi - E\tilde M_\pi = f_n(w)$ is independent of the specific choice of $\pi \in \Pi(w)$.
Hence, taking the letters of $w$ to be from the set of numbers $\{1,2,\dots,2k\}$
with the convention that $\pi(i) = w[i]$, we identify $w$ as a representative of
$\pi \in \Pi(w)$ (recall $w[0] = w[2k]$). For example, $w = abbc$ of $v(w) = 3$ distinct
letters becomes $w = 1223$ which we identify with the circuit $\pi \in \Pi(w)$ of
length 4 consisting of the edges $\{1,2\}$, $\{2,2\}$, $\{2,3\}$ and $\{3,1\}$. In view of
(4.21), we thus establish (4.20) by showing that for any $w$, some $C_w < \infty$
and all $n$,

(4.22) $|f_n(w)| = |EM_w - E\tilde M_w| \le C_w n^{k - v(w) + 1/2}.$

Let $q = q(w)$ be the number of indices $1 \le i \le 2k$ for which $w[i] = w[i-1]$
[e.g., $q(1223) = 1$]. It is clear from the definition of $M_n$ and $\tilde M_n$ that
$f_n(w) \ne 0$ only if $q(w) \ge 1$. Let $u = u(w)$ count the number of edges of distinct
endpoints in $w$, namely, with $\{w[i-1],w[i]\} \in \Gamma_n$, which appear exactly
once along the circuit $w$ [e.g., $u(1223) = 3$]. Then, by independence and
centering we have that $E\tilde M_w = 0$ as soon as $u(w) \ge 1$, whereas it is not
hard to check that if $u(w) > q(w)$, then also $EM_w = 0$. Thus, it suffices to
consider in (4.22) only circuits $w$ with $q(w) \ge u(w)$.

It is not hard to check that excluding the $q$ loop-edges (each connecting
some vertex to itself), there are at most $k + \lfloor (u-q)/2 \rfloor$ distinct edges in
$w$. These distinct edges form a connected path through $v(w)$ vertices, which
for $u \ge 1$ must also be a circuit. Consequently, for any of the words $w$ we
are to consider,

(4.23) $v(w) \le k + \mathbf{1}_{u(w)=0} + \lfloor (u(w) - q(w))/2 \rfloor \le k.$

Proceeding to bound $|f_n(w)|$, note that any contribution which grows
with $n$ must come from the $q$ diagonal entries of $M_n$ and $\tilde M_n$ which are
encountered according to the circuit $w$. Suppose first that $u \ge 1$, in which
case $f_n(w) = EM_w$. Computing the latter, upon expanding the sums in the
$q$ relevant diagonal entries of $D_n = \mathrm{diag}(\sum_{j=1}^{n} X_{ij})$, we must assign specific
choices to at least $u$ of the resulting "free" indices $j_1,\dots,j_q \in \{1,\dots,n\}$ in
order to match all $u$ unmatched edges of $w$ of the form $\{w[i-1],w[i]\} \in \Gamma_n$.
Indeed, by independence and centering, every other term of this expansion
has zero expectation. After doing so, as each diagonal entry of $D_n$ is
normal of mean zero and variance $n$, we conclude by Hölder's inequality that
$|f_n(w)| \le C_w n^{(q-u)/2}$. By our bound (4.23) on $v(w)$, this implies that (4.22)
holds.

Consider next words $w$ for which $u(w) = 0$ and let $a_1,\dots,a_q$ be the $q$
vertices for which $\{a_i,a_i\}$ is an edge of the circuit $w$. Let $M_{ii} = Q_i - S_i$
and $\tilde M_{ii} = \tilde Q_i - \tilde S_i$, for $i = 1,\dots,2k$, where
$Q_i = X_{ii} - \sum_{j=1}^{2k} X_{ij}$,
$\tilde Q_i = X_{ii} - \tilde X_{i,n+1} - \sum_{j=1}^{2k} \tilde X_{ij}$
and $S_i = \sum_{j=2k+1}^{n} X_{ij}$ with the corresponding expressions
for $\tilde S_i$. Note that we may and shall replace each $\tilde S_i$ by $S_i$ without altering
$E\tilde M_w$, and since the off-diagonal entries of $M_n$ and $\tilde M_n$ are the same, we
have that

$f_n(w) = E\left[ L_w \left\{ \prod_{i=1}^{q} (Q_{a_i} - S_{a_i}) - \prod_{i=1}^{q} (\tilde Q_{a_i} - S_{a_i}) \right\} \right]$

$\qquad = \sum_{i=1}^{q} E\left[ L_w \prod_{j=1}^{i-1} (Q_{a_j} - S_{a_j}) \, (Q_{a_i} - \tilde Q_{a_i}) \prod_{j=i+1}^{q} (\tilde Q_{a_j} - S_{a_j}) \right],$

where $L_w$ is the product of the $(2k-q)$ off-diagonal entries of $M_n$ that
correspond to the edges of $w$ that are in $\Gamma_n$. Since the joint distribution of
$(L_w, \{Q_i\}, \{\tilde Q_i\})$ is independent of $n > 2k$, while $M_{ii}$ and $\tilde M_{ii}$ are normal
of mean zero and variance at most $n+2$, it follows by Hölder's inequality
that $|f_n(w)| \le C_w n^{(q-1)/2}$, which by (4.23) results with (4.22).

As already seen, (4.22) implies that (4.20) holds and hence the proof of
the lemma is complete. □



Let $\gamma_0(dx) = \frac{1}{2\pi} \sqrt{4 - x^2}\, \mathbf{1}_{|x| \le 2}\, dx$ denote the semicircle distribution,
$\gamma_1(dx) = \frac{1}{\sqrt{2\pi}} \exp(-x^2/2)\, dx$ denote the standard normal distribution and let
$\gamma_M = \gamma_0 \boxplus \gamma_1$ be the corresponding free convolution. In view of Lemma 4.12,
our next result shows that the even moments of $E\hat\mu(M_n/\sqrt{n})$ converge as
$n \to \infty$ to those of $\gamma_M$.

Proposition 4.13. For every $k \in \mathbb{N}$,

(4.24) $\lim_{n\to\infty} n^{-(k+1)} E\,\mathrm{tr}(W_n^{2k}) = \int x^{2k} \, d\gamma_M.$

Proof. Let $A_n = Z_n + n^{-1/2} \xi I_n$, so $n^{-1/2} W_n = A_n + n^{-1/2} X_n$. By the
strong law of large numbers, with probability 1, $\hat\mu(A_n) \to \gamma_1$ weakly. Further,
$\sup_n E \int |x| \, d\hat\mu(A_n) < \infty$, and
$E \int |x| \, d\hat\mu(n^{-1/2} X_n) \le n^{-1} \sqrt{E\,\mathrm{tr}(X_n^2)} = 1$,
implying by Pastur and Vasilchuk ([17], Theorem 2.1 and page 280), that
$\hat\mu(W_n/\sqrt{n})$ converges weakly to $\gamma_M$, in probability. It follows that for any
$k \in \mathbb{N}$ and all $r < \infty$,

(4.25) $\lim_{n\to\infty} E \int h_r(x) \, d\hat\mu(W_n/\sqrt{n}) = \int h_r(x) \, d\gamma_M,$

where $h_r(x) = (\min(|x|,r))^{2k}$. Recall that all moments of $\gamma_M$ are finite (cf.
Proposition A.3), so as $r \to \infty$ the right-hand side of (4.25) converges to
$\int x^{2k} \, d\gamma_M$. It is not hard to check that for any $k \in \mathbb{N}$,

$E \int x^{2k} \, d\hat\mu(W_n/\sqrt{n}) = n^{-(k+1)} E\,\mathrm{tr}(W_n^{2k})$

is bounded in $n$ by some $c_k < \infty$. Hence, for all $n$,

$\left| n^{-(k+1)} E\,\mathrm{tr}(W_n^{2k}) - E \int h_r(x) \, d\hat\mu(W_n/\sqrt{n}) \right| \le c_{k+1} r^{-2},$

and (4.24) follows by considering $r \to \infty$ in (4.25). □



We next derive the analog of Proposition 4.3 and similarly to Proposition
4.9, get as a result the concentration of moments of $\hat\mu(M_n/\sqrt{n})$ around
those of $E(\hat\mu(M_n/\sqrt{n}))$.

Proposition 4.14. Fix $r \in \mathbb{N}$. Let $N$ denote the number of
vertex-matched quadruples of circuits in $\Gamma_n$ with $r$ vertices each, such that none
of them is self-matched. Then there is a constant $C_r$ such that

$N \le C_r n^{2r+2}.$

Proof. Let $\mathcal{P}$ denote the partition of the $4r$ vertices of the circuits
$\pi_1,\dots,\pi_4$ in $\Gamma_n$ to $|\mathcal{P}| \le 2r$ distinct groups of matching vertices, with at
least two elements in each group, while having each circuit cross-matched
to at least one of the other circuits. As part of $\mathcal{P}$ we specify also which
of the four types of edges to use in each connection along the circuits. For
$i = 1,2,3,4$, let $u_i = u_i(\mathcal{P})$ be the number of distinct vertices in $\pi_i$ that do
not appear in any $\pi_j$, $j < i$. There are at most $n^{1+u_1}$ ways to choose the
circuit $\pi_1$ in agreement with $\mathcal{P}$, that is, $n^2/2$ ways to choose the vertex $a_1$ of
$\pi_1$ and at most $n$ ways for each of the remaining $u_1 - 1$ distinct vertices of $\pi_1$.
For $i = 2,3,4$, per given $\pi_j$, $j < i$, the same procedure shows that there are
at most $n^{1+u_i}$ ways to complete the circuit $\pi_i$. Further, if $\pi_i$ is cross-matched
to $\pi_j$ for some $j < i$, then starting the completion of $\pi_i$ at a vertex that we
already determined by such a cross-match, we have that there are only $n^{u_i}$
ways to complete $\pi_i$. The latter improved bound always applies for $i = 4$,
and it is not hard to check that upon re-ordering the four circuits, we can
assure that it applies also for $i = 3$. We thus get at most $n^{u+2}$ quadruples
of circuits per choice of $\mathcal{P}$, where $u = \sum_i u_i = |\mathcal{P}| \le 2r$, yielding the stated
bound. □

Proposition 4.15. Suppose $\{X_{ij}; j > i \ge 1\}$ is a collection of bounded
i.i.d. random variables such that $E(X_{12}) = 0$ and $E(X_{12}^2) = 1$. For any $r \in \mathbb{N}$,
there exists $C_r < \infty$ such that $E[(\mathrm{tr}(M_n^r) - E\,\mathrm{tr}(M_n^r))^4] \le C_r n^{2r+2}$ for all
$n \in \mathbb{N}$.

Proof. By (4.16) we have the Markov analog of (4.13)

(4.26) $E[(\mathrm{tr}(M_n^r) - E\,\mathrm{tr}(M_n^r))^4] = \sum_{\pi_1,\pi_2,\pi_3,\pi_4} E\left[ \prod_{j=1}^{4} (X_{\pi_j} - E(X_{\pi_j})) \right],$

where the sum is taken over all circuits $\pi_j$, $j = 1,\dots,4$, in $\Gamma_n$, each having $r$
vertices. With the random variables $\{X_{ij}; n \ge j > i \ge 1\}$ independent and of
mean zero, just like the proof of Proposition 4.9, it suffices to take the sum
in (4.26) over all vertex-matched quadruples of circuits on $\Gamma_n$, such that
none of them is self-matched. Since $|X|$ (and hence $|X_\pi|$) is bounded, the
stated inequality follows from the bound of Proposition 4.14 on the number
of such quadruples. □

Proof of Theorem 1.3. The proof is very similar to that of Theorems
1.1 and 1.2, where by Proposition 4.1(ii), we may and shall assume
that $\{X_{ij}; j > i \ge 1\}$ is a collection of i.i.d. bounded random variables. Then,
by (4.18) the odd moments of the average measure $E(\hat\mu(M_n/\sqrt{n}))$ converge
to 0, and by Lemma 4.12 and Proposition 4.13 the even moments converge to
those of $\gamma_M$, whereas from Proposition 4.15 we conclude that with probability 1
the same applies to the moments of $\hat\mu(M_n/\sqrt{n})$. By Proposition A.3, $\gamma_M$ is a
symmetric measure of bounded smooth density that, though of unbounded support,
is uniquely determined by its moments (having in particular zero odd
moments). Hence, the almost sure convergence
$\int x^k \, d\hat\mu(M_n/\sqrt{n}) \to \int x^k \, d\gamma_M$
as $n \to \infty$, for all $k \in \mathbb{N}$, implies the weak convergence of $\hat\mu(M_n/\sqrt{n})$ to $\gamma_M$.
□

A.1. Properties of $\gamma_H$, $\gamma_M$ and $\gamma_T$. In this section we establish
properties of the symmetric measures with moments given by (3.5), and (3.8)
and the free convolution $\gamma_M$ of Theorem 1.3. For proofs, it is convenient to
express the volumes $p_H(w)$ and $p_T(w)$ as the probabilities that involve sums
of independent uniform random variables. This can be done by setting the
undetermined variables as the independent uniform $U[0,1]$ random variables
$U_0, U_1, \dots, U_k$, expressing the dependent variables as the linear combinations
of $U_0, U_1, \dots, U_k$, and expressing the volumes as the probabilities that these
linear combinations are in the interval $I$. For each partition word $w$ of length
$2k$ with a nonzero volume $p(w)$, this probability takes the form

(A.1) $p(w) = P\left( \bigcap_{i=1}^{M} \left\{ \sum_{j=0}^{k} n_{i,j} U_j \in [0,1] \right\} \right),$

where $n_{i,j}$ are integers and $M = k$.

Proposition A.1. A symmetric measure $\gamma_T$ with even moments given
by (3.5) has unbounded support.

Proof. It suffices to show that $(m_{2k})^{1/k} \to \infty$. Let $w$ be a partition
word of length $2k$. Denoting $S_i = \sum_j n_{i,j} U_j - \frac{1}{2}$, $i = 1,2,\dots,k$, we have

(A.2) $p_T(w) = P\left( \bigcap_{i=1}^{k} \{ |S_i| \le \tfrac{1}{2} \} \right).$

Since the coefficients $n_{i,j}$ in (A.1) take values $0, \pm 1$ only, and $\sum_j n_{i,j} = 1$,
each of the sums $S_i$ in (A.2) has the following form:

(A.3) $S = (U_\alpha - 1/2) + \sum_{j=1}^{L} (U_{\beta(j)} - U_{\gamma(j)}),$

where $\alpha, \beta(j), \gamma(j)$, $j = 1,\dots,L$, are all different. Let $L_i$ denote the number
of independent random variables $U_j$ in this representation for $S_i$. Clearly,
$1 \le L_i \le k+1$.

Fixing $\varepsilon > 0$ let $U_j = 1/2 + V_j/(\varepsilon(k+1))$ for $j = 0, \dots, k$. For $k > 1/\varepsilon$
define the event

$A = \bigcap_{j=0}^{k} \{|V_j| \le 1/2\},$

noting that conditionally on $A$, the random variables $V_0, \dots, V_k$ are
independent, each uniformly distributed on $[-1/2, 1/2]$. As under this conditioning
the i.i.d. random variables $\{V_j\}$ have symmetric laws, it is easy to check that
for $i = 1, \dots, k$, the form (A.3) of $S_i$ implies that

$P(|S_i| > 1/2 \mid A) = P\Big(\Big|\sum_{j=1}^{L_i} V_j\Big| > \varepsilon(k+1)/2 \,\Big|\, A\Big) = 2P\Big(\sum_{j=1}^{L_i} V_j > \varepsilon(k+1)/2 \,\Big|\, A\Big),$
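which by Markov's inequality is bounded above by

$2e^{-\varepsilon^2(k+1)/2}\,\big(E e^{\varepsilon V}\big)^{L_i} = 2e^{-\varepsilon^2(k+1)/2}\,\Big(\frac{e^{\varepsilon/2} - e^{-\varepsilon/2}}{\varepsilon}\Big)^{L_i}$

(with $V$ uniformly distributed on $[-1/2, 1/2]$). Since
$(e^{x/2} - e^{-x/2})/x \le e^{x^2/4}$ for $x > 0$, and $L_i \le k + 1$,
we deduce that

(A.4)    $P(|S_i| > 1/2 \mid A) \le 2\exp\big(-\varepsilon^2(k+1)/2 + \varepsilon^2 L_i/4\big) \le 2e^{-\varepsilon^2(k+1)/4},$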



for $i = 1, \dots, k$. As $2k e^{-\varepsilon^2(k+1)/4} < 1/2$ for some $k_0 = k_0(\varepsilon) < \infty$ and all
$k > k_0$, it follows from (A.2) and (A.4) that for all $k > k_0$ and any word $w$
of length $2k$,

(A.5)    $p_T(w) \ge \tfrac{1}{2} P(A) = \tfrac{1}{2} (\varepsilon(k+1))^{-(k+1)}.$

Since there are more than $k!$ partition words $w$ of length $2k$, this shows that
for all large enough $k$ we have

$m_{2k} \ge \tfrac{1}{2}\, k!\, (\varepsilon(k+1))^{-(k+1)} \ge (3\varepsilon)^{-k}.$

Hence, $\limsup_{k\to\infty} m_{2k}^{1/k} \ge 1/(3\varepsilon)$. As $\varepsilon > 0$ is arbitrarily small, this
completes the proof. □
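
As a numerical sanity check of the two elementary estimates used in this proof, the following Python sketch tests, on a grid, the bound $(e^{x/2} - e^{-x/2})/x \le e^{x^2/4}$ for the moment generating function of a uniform variable on $[-1/2, 1/2]$, and compares the two sides of the final moment bound on a logarithmic scale. The function names and the sample values of $\varepsilon$ and $k$ are illustrative assumptions, not part of the argument.

```python
import math

def mgf_uniform(x):
    """E exp(x V) for V uniform on [-1/2, 1/2], valid for x != 0."""
    return (math.exp(x / 2) - math.exp(-x / 2)) / x

def log_gap(k, eps):
    """log[(1/2) k! (eps (k+1))^{-(k+1)}] - log[(3 eps)^{-k}]."""
    lhs = -math.log(2) + math.lgamma(k + 1) - (k + 1) * math.log(eps * (k + 1))
    rhs = -k * math.log(3 * eps)
    return lhs - rhs

# Check E exp(x V) <= exp(x^2 / 4) for x = 0.01, 0.02, ..., 10.
mgf_ok = all(
    mgf_uniform(0.01 * i) <= math.exp((0.01 * i) ** 2 / 4)
    for i in range(1, 1001)
)

print(mgf_ok)
# A positive gap means the lower bound (3 eps)^{-k} holds for that k.
print(log_gap(50, 1.0) > 0, log_gap(50, 0.1) > 0)
```

The gap depends on $\varepsilon$ only through the additive term $-\log\varepsilon$, so a smaller $\varepsilon$ can only make the final inequality hold sooner.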

Proposition A.2. A symmetric measure $\gamma_H$ with even moments given
by (3.8) is not unimodal and has unbounded support.

Proof. Suppose that the symmetric distribution $\gamma_H$ is unimodal. Since
all moments of $\gamma_H$ are finite, from Khinchin's theorem (see [14], Theorem
4.5.1), it follows that if $\phi(t) = \int e^{itx}\,\gamma_H(dx)$ denotes the characteristic
function of $\gamma_H$, then $g(t) = \phi(t) + t\phi'(t)$ must be a characteristic function, too.
The even moments corresponding to $g(t)$ are $(2k+1)m_{2k}(\gamma_H)$, and must
be a positive definite sequence, that is, the Hankel matrices with entries
$[(2(i+j) - 3)\,m_{2(i+j-2)}(\gamma_H)]_{1 \le i,j \le n}$ should all be nonnegative definite.
However, with $m_4 = 2$, $m_6 = 11/2$ and $m_8 = 281/15$, for $n = 3$ the determinant

$\det\begin{bmatrix} 1 & 3m_2 & 5m_4 \\ 3m_2 & 5m_4 & 7m_6 \\ 5m_4 & 7m_6 & 9m_8 \end{bmatrix}
= \det\begin{bmatrix} 1 & 3 & 10 \\ 3 & 10 & 77/2 \\ 10 & 77/2 & 843/5 \end{bmatrix}
= -73/20$

is negative. Thus, $\gamma_H$ is not unimodal.
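
The value of this determinant can be reproduced in exact rational arithmetic. The sketch below is only a check of the arithmetic: it takes the moments quoted in the proof as given, reads $m_0 = 1$ and $m_2 = 1$ off the matrix entries $1$ and $3m_2 = 3$, and uses a hypothetical helper `det3`.

```python
from fractions import Fraction

def det3(a):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

# Even moments m_0, ..., m_8 of gamma_H as quoted in the proof
# (m_0 = m_2 = 1 are inferred from the displayed matrix).
m = {0: Fraction(1), 2: Fraction(1), 4: Fraction(2),
     6: Fraction(11, 2), 8: Fraction(281, 15)}

n = 3
# Entry (i, j) is (2(i+j) - 3) * m_{2(i+j-2)} for 1 <= i, j <= n.
hankel = [[(2 * (i + j) - 3) * m[2 * (i + j - 2)] for j in range(1, n + 1)]
          for i in range(1, n + 1)]

d = det3(hankel)
print(d)  # -73/20
```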

To show that the support of $\gamma_H$ is unbounded we proceed as in the
Toeplitz case. The main technical obstacle is that some partition words
contribute zero volume. We will therefore have to find enough partition
words that contribute a nonzero volume, and then give a lower bound for
this contribution.
We consider only moments of order $4k - 2$, $k \ge 2$, and find the contribution
of the partition words which have no repeated letters in the first half, that
is,

$w[1] \ne w[2] \ne \dots \ne w[2k-1].$

That is, we consider the set of partition words $w$ of length $4k - 2$ of the
form $w = abc\ldots$ with the first $2k - 1$ letters written in the fixed (alphabetic)
order, followed by the repeated letters $a, b, c, \ldots$ at positions $2k, \dots, 4k - 2$.
We also require that the repeats are placed at odd distance from the original
matching letter. Formally, we consider the set of partition words $w$ of length
$4k - 2$ which satisfy the following condition:

If $w[\alpha] = w[\beta]$ and $\alpha < \beta$, then $\alpha \not\equiv \beta \bmod 2$, $\alpha \le 2k - 1$ and $\beta \ge 2k$.

Since we can permute all letters at locations $2k, 2k + 2, \dots, 4k - 2$ and all
letters at locations $2k + 1, 2k + 3, \dots, 4k - 3$, clearly there are $k!(k-1)!$ such
partition words.
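
The count $k!(k-1)!$ can be confirmed by brute force for small $k$. In the sketch below a word of the class is encoded, as an illustrative assumption, by the map sending each first-half position $\alpha$ to the position $\beta$ of its repeat; all bijections from the first half to the second half are enumerated and only those satisfying the parity condition are kept.

```python
from itertools import permutations
from math import factorial

def in_class(partner, k):
    """partner[a] = b: the letter at position a (1-based) repeats at position b."""
    return all(b >= 2 * k and (a - b) % 2 == 1 for a, b in partner.items())

def count_class_words(k):
    first = range(1, 2 * k)           # positions 1, ..., 2k-1
    second = range(2 * k, 4 * k - 1)  # positions 2k, ..., 4k-2
    return sum(in_class(dict(zip(first, perm)), k)
               for perm in permutations(second))

for k in (2, 3, 4):
    # Brute-force count next to the closed form k!(k-1)!.
    print(k, count_class_words(k), factorial(k) * factorial(k - 1))
```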

To show that all such partition words contribute a nonzero volume, we
need to carefully analyze the matrix of the resulting system of equations (3.6).
This is a $(2k - 1) \times (4k - 1)$ matrix with entries $0, \pm 1$ only. The first $2k - 1$
columns of the matrix are filled in with the pattern of sliding pairs 1, 1
corresponding to first occurrences of every letter, that is, the left-hand sides of
equations (3.6) are simply

$x_0 + x_1 = \cdots$
$x_1 + x_2 = \cdots$
$\vdots$
$x_{2k-2} + x_{2k-1} = \cdots.$

So the first $2k$ columns of the matrix are as follows, with the star denoting
as yet unspecified entries of the $2k$th column.

$\begin{bmatrix}
1 & 1 & 0 & 0 & \cdots & 0 & 0 & * \\
0 & 1 & 1 & 0 & \cdots & 0 & 0 & * \\
0 & 0 & 1 & 1 & \cdots & 0 & 0 & * \\
\vdots & & & & & & & \vdots \\
0 & 0 & 0 & 0 & \cdots & 1 & 1 & * \\
0 & 0 & 0 & 0 & \cdots & 0 & 1 & 1
\end{bmatrix}$

The remaining columns are as follows. In every even row of the second half
we have disjoint (nonoverlapping) pairs $(-1, -1)$, including the site adjacent
to the "last letter," that has entry 1 in the last row, and entry $-1$ in one of
the odd rows. None of these $-1, -1$ are in the last column, a coefficient of
$x_{4k-2}$. In the odd rows we have pairs of consecutive $(-1, -1)$ which overlap
entries from the even rows, but not themselves, including a single $(-1, -1)$
pair which fills in one spot in the last column, the coefficients of $x_{4k-2}$.

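For example, the word $w = abc\ldots abc\ldots$, where all $2k - 1$ letters $a, b, c, \ldots$
are repeated alphabetically twice, is in the class of the partition words under
consideration. The corresponding system of equations is

$x_0 + x_1 = x_{2k-1} + x_{2k},$
$x_i + x_{i+1} = x_{2k+i-1} + x_{2k+i}, \quad i = 1, 2, \dots, 2k - 3,$
$x_{2k-2} + x_{2k-1} = x_{4k-3} + x_{4k-2},$

and its matrix is

$\begin{bmatrix}
1 & 1 & 0 & 0 & \cdots & 0 & 0 & -1 & -1 & 0 & \cdots & 0 & 0 \\
0 & 1 & 1 & 0 & \cdots & 0 & 0 & 0 & -1 & -1 & \cdots & 0 & 0 \\
0 & 0 & 1 & 1 & \cdots & 0 & 0 & 0 & 0 & -1 & \cdots & 0 & 0 \\
\vdots & & & & & & & & & & & & \vdots \\
0 & 0 & 0 & 0 & \cdots & 1 & 1 & 0 & 0 & 0 & \cdots & -1 & 0 \\
0 & 0 & 0 & 0 & \cdots & 0 & 1 & 1 & 0 & 0 & \cdots & -1 & -1
\end{bmatrix}$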

All other partition words in our class are obtained from permuting letters
$w[2k], w[2k + 2], \dots, w[4k - 2]$, and then permuting letters
$w[2k + 1], w[2k + 3], \dots, w[4k - 3]$ of $w = abc\ldots abc$. Thus all other systems of
equations are obtained from the above one by permuting even rows in columns
$2k + 1, 2k + 2, \dots, 4k - 2$ and odd rows in columns $2k, 2k + 1, \dots, 4k - 1$ (apart
from the 1 at column $2k$ and row $2k - 1$ which is never permuted, but gets eliminated
if the first row permutes to become the last one). For each of these words
the sum of all odd rows in the system minus the sum of all even rows is
$[1, 0, \dots, 0, -1]$, implying that for such $w$ the additional constraint $x_0 = x_{4k-2}$
we require when computing $p_H(w)$ is merely a consequence of (3.6).

The solutions of equations (3.6) for such partition words $w$ are easy to
analyze due to parity considerations. Gaussian elimination consists here of
subtractions of the given row from the row directly above it, starting with
the subtraction of the $(2k - 1)$th row and ending with the subtraction of the
second row from the first row, at which point the first $2k - 1$ columns become
the identity matrix. During these subtractions, a $-1$ entry in each column
of the original system can meet a nonzero entry only from a row positioned
at an odd distance above it, in which case they cancel each other. So as
we keep subtracting, all coefficients take values $0, \pm 1$ only. Further, for each
row the sum of the entries in columns $2k, \dots, 4k - 1$ is $-2$, except for the
last row for which it is $-1$. Thus, after all subtractions have been made,
these sums are $-1$ at each of the rows. We can now set the $2k$ undetermined
variables to i.i.d. $U[0, 1]$ random variables, $x_{2k-1} = U_0, \dots, x_{4k-2} = U_{2k-1}$,
and solve the $2k - 1$ equations for the dependent variables $x_0, \dots, x_{2k-2}$.
By the above considerations we know that each of these dependent random
variables is expressed as an alternating sum of independent uniform $U[0, 1]$
random variables of the form (A.3).
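
Two structural claims made above, namely that the alternating sum of the rows equals $[1, 0, \dots, 0, -1]$ and that every dependent variable has coefficients in $\{0, \pm 1\}$ summing to 1, can be checked mechanically for small $k$. The sketch below assumes that each equation of (3.6) has the form $x_{a-1} + x_a = x_{b-1} + x_b$ for a letter at positions $a < b$, as in the displayed example; the helper names are hypothetical.

```python
from itertools import permutations

def class_words(k):
    """Yield maps a -> b (1-based positions) pairing each letter with its repeat."""
    odd_first, even_first = range(1, 2 * k, 2), range(2, 2 * k, 2)
    even_second = range(2 * k, 4 * k - 1, 2)
    odd_second = range(2 * k + 1, 4 * k - 2, 2)
    for p in permutations(even_second):
        for q in permutations(odd_second):
            partner = dict(zip(odd_first, p))
            partner.update(zip(even_first, q))
            yield partner

def system_matrix(k, partner):
    """Row a encodes x_{a-1} + x_a - x_{b-1} - x_b = 0 in x_0, ..., x_{4k-2}."""
    rows = []
    for a in range(1, 2 * k):
        b = partner[a]
        row = [0] * (4 * k - 1)
        for col, sign in ((a - 1, 1), (a, 1), (b - 1, -1), (b, -1)):
            row[col] += sign
        rows.append(row)
    return rows

def alternating_row_sum(rows):
    """Sum of rows 1, 3, 5, ... minus the sum of rows 2, 4, 6, ..."""
    return [sum((-1) ** i * row[c] for i, row in enumerate(rows))
            for c in range(len(rows[0]))]

def solve_dependent(k, partner):
    """Coefficients of x_0..x_{2k-2} in terms of the free x_{2k-1}..x_{4k-2}."""
    free = range(2 * k - 1, 4 * k - 1)
    coef = {j: [int(j == f) for f in free] for j in free}
    for a in range(2 * k - 1, 0, -1):  # back-substitution from the last equation
        b = partner[a]
        coef[a - 1] = [-u + v + w
                       for u, v, w in zip(coef[a], coef[b - 1], coef[b])]
    return [coef[i] for i in range(2 * k - 1)]

def check(k):
    target = [1] + [0] * (4 * k - 3) + [-1]
    for partner in class_words(k):
        if alternating_row_sum(system_matrix(k, partner)) != target:
            return False
        for c in solve_dependent(k, partner):
            if sum(c) != 1 or not set(c) <= {-1, 0, 1}:
                return False
    return True

print(check(2), check(3))
```

Back-substitution from the last equation upward plays the role of the row subtractions described in the text; it is used here only because it is shorter to write down.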

The argument we used for deriving (A.5) thus gives the bound
$p_H(w) \ge \tfrac{1}{2}(2k\varepsilon)^{-2k}$ for each of these $k!(k-1)!$ partition words, and hence
for all $k$ large enough, we have

$m_{4k-2}(\gamma_H) \ge \tfrac{1}{2}\, k!(k-1)!\, (2\varepsilon k)^{-2k} \ge (6\varepsilon)^{-2k}.$

Thus $m_{4k-2}^{1/k} \to \infty$, which implies that the support of $\gamma_H$ is unbounded. □

Proposition A.3. The free convolution $\gamma_M = \gamma_0 \boxplus \gamma_1$ of the standard
semicircle distribution $\gamma_0$ and the standard normal $\gamma_1$ is a symmetric
measure, determined by moments, has unbounded support and a smooth bounded
density.

Proof. By Corollary 2 in [2], $\gamma_M$ has a density, by Corollary 4 in [2]
the density is smooth and by Proposition 5 in [2] it is bounded.

We now verify that $\gamma_M$ is determined by moments and has unbounded
support. We need the following observation: a probability measure $\mu$ has odd
moments vanishing iff the odd free cumulants $k_{2r+1}(\mu)$ of $\mu$ vanish. This can
be easily read from formula (72) in [22].

Since free cumulants linearize the free convolution, $k_r(\gamma_M) = k_r(\gamma_0) +
k_r(\gamma_1)$. This shows that the odd moments of $\gamma_M$ vanish. Recall that the
free cumulants $k_n(\mu)$ and the moments $m_n(\mu)$ of a probability measure $\mu$
are related by formula (72) in [22]. In particular, for $\mu$ with vanishing odd
moments, the even cumulants $k_{2r}(\mu)$ are related to the moments by the
equations

(A.6)    $m_{2n}(\mu) = \sum_{r=1}^{n} k_{2r}(\mu) \sum_{i_1 + \cdots + i_{2r} = 2n - 2r}\ \prod_{j=1}^{2r} m_{i_j}(\mu), \quad n = 1, 2, \ldots.$

By symmetry, the odd cumulants of $\gamma_1$ vanish, and $k_{2r}(\gamma_1)$ are
nonnegative; $k_{2r}(\gamma_1)$ count all irreducible pair partitions of $\{1, \dots, 2r\}$ (see [5], page
152). Since $k_2(\gamma_0) = 1$, and all higher free cumulants of $\gamma_0$ vanish (see [12],
Example 2.4.6), we have

$k_{2r}(\gamma_1) \le k_{2r}(\gamma_M) \le 2k_{2r}(\gamma_1).$

Together with (A.6) this implies by induction that

$m_{2r}(\gamma_1) \le m_{2r}(\gamma_M) \le 4^r m_{2r}(\gamma_1).$

In particular, $\gamma_M$ has unbounded support and is uniquely determined by
moments. Since its odd cumulants vanish, the odd moments vanish and $\gamma_M$
is symmetric. □
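
The moment bounds above can be illustrated numerically for the first few even orders. The sketch below applies the moment-cumulant recursion $m_n = \sum_{s=1}^{n} k_s \sum_{i_1+\cdots+i_s = n-s} \prod_j m_{i_j}$, of which (A.6) is the case of vanishing odd cumulants. The numbers 1, 1, 4, 27 of irreducible pair partitions of $\{1,\dots,2r\}$ for $r = 1,\dots,4$ are entered by hand as an assumption, and the function name is hypothetical.

```python
def moments_from_free_cumulants(kappa, n_max):
    """Moments m_0..m_{n_max} from free cumulants kappa[1..n_max] (kappa[0] unused)."""
    m = [1] + [0] * n_max
    for n in range(1, n_max + 1):
        total = 0
        power = [1] + [0] * n  # coefficients of (sum_i m_i x^i)^s, starting at s = 0
        for s in range(1, n + 1):
            new = [0] * (n + 1)
            for a, pa in enumerate(power):
                if pa:
                    for b in range(n - a):  # keep degrees a + b <= n - 1 only
                        new[a + b] += pa * m[b]
            power = new
            total += kappa[s] * power[n - s]
        m[n] = total
    return m

# Assumed counts of irreducible pair partitions of {1..2r}, r = 1..4.
irreducible = [1, 1, 4, 27]

kappa_normal = [0] * 9  # free cumulants of the standard normal gamma_1
kappa_markov = [0] * 9  # free cumulants of gamma_M = semicircle boxplus normal
for r, count in enumerate(irreducible, start=1):
    kappa_normal[2 * r] = count
    kappa_markov[2 * r] = count + (1 if r == 1 else 0)  # semicircle adds k_2 = 1

m_normal = moments_from_free_cumulants(kappa_normal, 8)
m_markov = moments_from_free_cumulants(kappa_markov, 8)

for r in range(1, 5):
    # lower bound, moment of gamma_M, upper bound 4^r m_{2r}(gamma_1)
    print(m_normal[2 * r], m_markov[2 * r], 4 ** r * m_normal[2 * r])
```

The normal moments come out as the double factorials 1, 3, 15, 105, which is a consistency check on the assumed cumulant values.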
A.2. Moments of free convolution. In this section we identify moments
of the free convolution $\gamma_0 \boxplus \gamma_1$. The result and the method of proof were
suggested by Bożejko and Speicher [5], who give a combinatorial expression
for the moments of free convolutions of normal densities.

Denote by $W$ the set of all partition words. Recall that a (partition)
subword of a word $w$ is a partition word $w_1$ such that $w = a\ldots c\,w_1\,d\ldots z$. Let
$W_0$ be the set of all irreducible partition words, that is, words that have no
proper (nonempty) partition subwords.

Definition A.1 ([5]). We say that $p : W \to \mathbb{R}$ is pyramidally
multiplicative, if for every $w \in W$ of the form $w = a\ldots c\,w_1\,d\ldots z$, we have
$p(w) = p(w_1)\,p(a\ldots cd\ldots z)$.

Lemma A.4 ([5], page 152). Suppose that the moments are given by

(A.7)    $m_{2n} = \sum_{w \in W,\ |w| = 2n} p(w)$

and $m_{2n-1} = 0$, $n = 1, 2, \ldots$. If the weights $p(w)$ are pyramidally
multiplicative, then the free cumulants are

$k_{2n} = \sum_{w \in W_0,\ |w| = 2n} p(w).$

Proposition A.5. A symmetric measure $\gamma_M$ with the even moments
given by (3.1) is given by the free convolution $\gamma_M = \gamma_0 \boxplus \gamma_1$.

Proof. We apply Lemma A.4 to measures $\gamma_M$, $\gamma_0$ and $\gamma_1$. If $w = \ldots w_1 \ldots$,
then $h(w) = h(w_1) + h(w \setminus w_1)$, so the Markov weights $p_M(w) := 2^{h(w)}$ are
pyramidally multiplicative. It is well known that the moments of the normal
distribution are given by (A.7) with $p_1(w) = 1$, which is (trivially)
multiplicative. The moments of the semicircle distribution are given by (A.7)
with $p_0(w) = 1$ for the so-called noncrossing words, and $p_0(w) = 0$ otherwise.
(A partition word is noncrossing, if it can be reduced to the empty word by
removing pairs of consecutive double letters $xx$, one at a time.) It is well
known that this weight is pyramidally multiplicative, too.

We now use Lemma A.4 to compare the free cumulants of the semicircle,
normal and Markov distributions. Let $w \in W_0$. If $|w| = 2$, then $p_M(w) = 2$,
and otherwise $p_M(w) = 2^0 = 1$ as an irreducible word has no proper
subwords, and hence no encapsulated subwords. Thus $k_2(\gamma_M) = 2$, and for $n \ge 2$

$k_{2n}(\gamma_M) = \#\{w \in W_0,\ |w| = 2n\}.$

If $|w| = 2$, then $p_0(aa) = 1$, and otherwise $p_0(w) = 0$ as an irreducible word
of length 4 or more cannot be noncrossing. Thus $k_2(\gamma_0) = 1$, and for $n \ge 2$
we have $k_{2n}(\gamma_0) = 0$.
From $p_1(w) = 1$ we get

$k_{2n}(\gamma_1) = \#\{w \in W_0,\ |w| = 2n\}$

for $n \ge 1$; in particular, $k_2(\gamma_1) = 1$. Thus, for $n \ge 1$

$k_{2n}(\gamma_M) = k_{2n}(\gamma_0) + k_{2n}(\gamma_1),$

which proves that $\gamma_M = \gamma_0 \boxplus \gamma_1$. □
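
The cumulant comparison in this proof can be checked by enumeration for words of length up to 8. In the sketch below a partition word is represented, as an illustrative assumption, by its pair partition of the positions $0, \dots, 2n-1$; a word is treated as irreducible when no proper nonempty block of consecutive positions is closed under the pairing, and as noncrossing when no two pairs interlace. The helper names are hypothetical.

```python
def pairings(points):
    """Yield all pair partitions of the sorted list `points` as lists of pairs."""
    if not points:
        yield []
        return
    first, rest = points[0], points[1:]
    for i, other in enumerate(rest):
        for tail in pairings(rest[:i] + rest[i + 1:]):
            yield [(first, other)] + tail

def is_irreducible(pairs, n):
    """True if no proper interval of positions 0..n-1 is closed under the pairing."""
    partner = {}
    for a, b in pairs:
        partner[a], partner[b] = b, a
    for start in range(n):
        for end in range(start + 1, n):
            proper = end - start + 1 < n
            if proper and all(start <= partner[p] <= end
                              for p in range(start, end + 1)):
                return False
    return True

def is_noncrossing(pairs):
    """True if no two pairs (a, b), (c, d) satisfy a < c < b < d."""
    return not any(a < c < b < d for a, b in pairs for c, d in pairs)

def counts(n_pairs):
    """(# all words, # irreducible words, # irreducible noncrossing words)."""
    n = 2 * n_pairs
    words = list(pairings(list(range(n))))
    irreducible = [w for w in words if is_irreducible(w, n)]
    return len(words), len(irreducible), sum(is_noncrossing(w) for w in irreducible)

for n_pairs in range(1, 5):
    total, k_normal, k_semicircle = counts(n_pairs)
    # Markov weight on irreducible words: 2 for length 2, otherwise 1.
    k_markov = 2 * k_normal if n_pairs == 1 else k_normal
    print(2 * n_pairs, k_semicircle, k_normal, k_markov,
          k_markov == k_semicircle + k_normal)
```

Only the irreducible words enter the cumulants, so the weight $2^{h(w)}$ never has to be evaluated on a general word in this check.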